Abstract

We derive a theoretical two-factor model whose empirical explanatory power is similar
to that of the Fama-French three-factor model. In addition to the usual market risk, our
model accounts for a diversification risk, proxied by the equally-weighted portfolio, which
results from an "internal consistency factor" appearing for arbitrarily large economies,
as a consequence of the concentration of the market portfolio when the distribution of the
capitalization of firms is sufficiently heavy-tailed, as in real economies. Our model rationalizes
the superior performance of the Fama and French three-factor model in explaining the cross 
section of stock returns: the size factor constitutes an alternative proxy of the diversification 
factor while the book-to-market effect is related to the increasing sensitivity of value stocks 
to this factor. 

Introduction 

In the standard equilibrium and/or arbitrage pricing framework, the value of any asset is 
uniquely specified from the belief that only the systematic risks need to be remunerated 
by the market. This is the conclusion of the CAPM (Treynor 1961, Treynor 1999, Sharpe 
1964, Lintner 1965, Mossin 1966) and of the APT (Ross 1976, Roll and Ross 1980, Roll 
and Ross 1984, Roll 1994). Here, we show that, even for arbitrarily large economies, when
the distribution of the capitalization of firms is sufficiently heavy-tailed, as is the case in real
economies, there may exist a new source of significant systematic risk, which has been
neglected up to now but must be priced by the market. This new source of risk can readily
explain several asset pricing anomalies on the sole basis of the internal consistency of the
market model.

This result is based on two ingredients. The first one is the tautological internal consis- 
tency condition that the market portfolio, and any other factor that can be replicated by 
a portfolio of assets traded on the market, is constituted - by construction - of the assets 
whose returns it is supposed to explain. This internal consistency condition leads mechan- 
ically to correlations between the return residuals, as already stressed by Fama (1973) and 
Sharpe (1990, footnote 13) when the return on the market portfolio is considered as the only 
explanatory factor, or by Chamberlain (1983) in the case where there exist several linearly
independent portfolios that contain only "factor" variance and are therefore optimal for any
risk-averse investor. These correlations are equivalent to the existence of at least one internal
consistency factor (uncorrelated with the market and the other explanatory factors), which
is a function of the weights of the market portfolio and of the portfolios replicating the other
factors. The impact of this new factor is usually neglected on the basis of the law of
large numbers applied to well-diversified portfolios.

* The authors acknowledge helpful discussions and exchanges with M. Avellaneda, M. Brennan, X. Gabaix,
M. Grinblatt, M. Meerschaert, V. Pisarenko, R. Roll, D. Zajdenweber, W. Ziemba and the seminar participants
at New York University. All remaining errors are ours.

Actually, when the distribution of the weights of the portfolios replicating the explanatory
factors (the distribution of the capitalization of firms in the case of the market portfolio, for
instance) is sufficiently heavy-tailed, the law of large numbers, which is at the origin of the
diminishing contribution of the residual risks to the total risk of "well-diversified portfolios"
(Ross 1976, Huberman 1982), breaks down. Intuitively, whatever the size of the economy,
the largest firms contribute idiosyncratic risks that cannot be diversified away. In this case, the
generalized central limit theorem (Gnedenko and Kolmogorov 1954) shows that the impact
of an internal consistency factor does not vanish even for infinite economies¹. This may
be the origin of a significant amount of risk for portfolios that would otherwise have been
assumed "well-diversified" in its absence. As a consequence, when writing down the APT,
for instance, an additional explanatory factor must be accounted for.

This result must be contrasted with the many seminal papers deriving the APT and
providing pricing bounds for finite economies. Indeed, following for instance Dybvig (1983)
or Grinblatt and Titman (1983), among others, the residual risk of well-diversified portfolios
resulting from the finiteness of the economy should be priced, but the pricing error relative
to a pure factor model disappears in the limit of a large economy, as a full diversification
of the non-systematic risk is achieved. In contrast, we find that the lack of diversification
persists even when the number of traded assets is infinite. Besides, the generalization of
Ross (1976)'s results provided by Chamberlain (1983) breaks down as a result of this lack
of diversification. Indeed, Chamberlain (1983)'s results explicitly require that the risk of
any sequence of portfolios bearing only residual risks converges to zero if the portfolios
are well-diversified. Similarly, one can no longer apply Connor (1982)'s result that the
APT pricing equation holds exactly if each asset has an infinitesimal weight in the economy.
Indeed, in economies with a heavy-tailed distribution of firm sizes, the largest company has
a size of the same order as the total size of all the companies². These different remarks
are in fact intimately entangled, as will become clear in the remainder of this article. We stress
that our results are driven by the fat-tailed nature of the distribution of the weights of the
portfolios replicating the factors (when replication is possible), as occurs for the market
portfolio when the distribution of firm sizes is heavy-tailed. Our results do not rely on any
other distributional assumption concerning the explanatory factors or the disturbance terms.
For simplicity, we will assume that both the factors and the disturbance terms have finite
variance, but this is only for convenience of exposition. Our results could
easily be generalized to the case where factors and disturbance terms do not admit a finite
second moment on the basis of the result established by Wang (1988), for instance.

The introduction of our new "internal consistency factor," which basically accounts for the
lack of diversification of the market portfolio, allows us to provide a theoretical explanation of
several well-known pricing anomalies. In particular, the relevance of the two effects studied
by Fama and French (1992, 1993, 1995), namely the small-firm effect (first documented
by Banz (1981)) and the book-to-market ratio, can be understood from and rationalized
within the theoretical framework of the APT when the "internal consistency factor," and
its associated diversification premium, is accounted for. Thus, our model bridges the gap
between Fama and French's phenomenological model and the arbitrage pricing theory. More
precisely, our model provides an understanding of the superior performance of Fama and
French's three-factor model in explaining the cross section of stock returns. Indeed, the
new internal consistency factor provides a rationalization of the size factor as a proxy of the
internal consistency factor. Besides, consistent with the fact that high book-to-market stocks
have significantly lower betas with respect to the market portfolio compared with low
book-to-market stocks (Bernardo et al. 2007), the book-to-market effect also emerges naturally
from our formalism. In the context of the ongoing debate (Lakonishok et al. 1994, Daniel
et al. 2001) on the interpretation of the two empirical effects analyzed by Fama and French
(1993), we provide an explanation with solid economic underpinning.

¹ In a different context, Gabaix (2005) has proposed that the same kind of argument can explain that
idiosyncratic firm-level fluctuations are responsible for an important part of aggregate shocks, and therefore provide a
microfoundation for aggregate productivity shocks. Indeed, as in the present article, it is suggested that the
traditional argument according to which individual firm shocks average out in aggregate breaks down if the distribution
of firm sizes is fat-tailed, as documented empirically.

² This simply results from the large deviation theorem on heavy-tailed distributions according to which, given
$N$ iid random variables $S_1, \ldots, S_N$ with a fat-tailed distribution, we have (Embrechts et al. 1997)

$$\lim_{x \to \infty} \frac{\Pr[\max(S_1, \ldots, S_N) > x]}{\Pr[S_1 + \cdots + S_N > x]} = 1.$$

The article is organized as follows. In the next section, we synthesize the available empir- 
ical evidence on the fat-tailed nature of the distribution of firm sizes and their consequence 
on the lack of diversification of the market portfolio. Then, in section 2, we make clear 
the consequences of the internal consistency condition mentioned above; due to the internal 
consistency condition, we show that the disturbance terms must obey a condition which determines
their correlation. Next, we present our main results on the asymptotic behavior of
the variance of well-diversified portfolios: we show that, together with the market risk, there
is an additional source of systematic risk resulting from the internal consistency condition.
This additional risk may be of the same order as the market risk even for infinite economies
when the distribution of the capitalization of companies is sufficiently heavy-tailed. Section
3 confirms, by use of numerical simulations, the relevance of the concentration effect for
markets with a realistic number of traded assets. Then, it discusses the consequences for the
arbitrage pricing of financial assets, providing an expression that accounts for the premium
required by investors to bear this systematic "internal consistency" risk, and proposes
proxies for the empirical assessment of this risk premium. This allows us to provide theoretical
economic explanations of some of the empirical factors reported in the literature, while an
empirical analysis shows that, on the basis of only two factors (the market portfolio and
the equally-weighted portfolio), our model is at least as successful as the Fama and French
three-factor model over the period Jan. 1927 to Dec. 2005 for the US market data available
on Professor French's website³. Section 4 summarizes our results and draws some conclusions.

1 The distribution of firm sizes and the concentration 
of the market portfolio 

The study of the distribution of firm sizes has a rich history. Zipf (1949) made
an important early contribution by establishing that US corporation assets approximately
followed the law $s(n) \sim 1/n$ (now referred to as Zipf's law): when sizes are ranked from
the largest to the smallest, Zipf's law states that the size $s(n)$ of the $n$th largest firm is
inversely proportional to its rank $n$. Inverting this relation, the rank of the $n$th
largest firm is inversely proportional to its size, $n \sim 1/s(n)$, which is nothing but the sample
complementary cumulative distribution of the Pareto law with a tail exponent $\mu = 1$.
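
The inversion from rank-size rule to tail exponent can be sketched numerically. The snippet below is only an illustration under stated assumptions: the sizes are hypothetical and follow $s(n) = C/n$ exactly, and the constant `C` and the helper name `tail_exponent_from_ranks` are ours, not part of the original analysis. It regresses log-rank on log-size and reads off the tail exponent as minus the slope.

```python
import math

def tail_exponent_from_ranks(sizes):
    """Least-squares slope of log(rank) on log(size), with the sign flipped.

    For a Pareto law, rank ~ size**(-mu), so minus the slope estimates mu.
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(s) for s in ordered]
    ys = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Hypothetical economy obeying Zipf's law exactly: s(n) = C / n (assumption).
C = 1000.0
zipf_sizes = [C / n for n in range(1, 501)]
mu_hat = tail_exponent_from_ranks(zipf_sizes)  # close to 1 by construction
```

On real data such a rank-size regression is only a rough estimator, which is one reason the empirical studies cited below report a range of exponents.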

Zipf's law seems to be a robust property of business firms⁴ (Ijiri and Simon 1977). Indeed,
several proxies for the size of companies have been used, all of which recover the same robust
result that the exponent $\mu$ is equal or close to 1: assets, market capitalizations, number of
employees, profits, revenues, sales, value added and so on (Axtell 2001, Axtell 2006, Gabaix
et al. 2006, Marsili 2005, Simon and Bonini 1958). Besides, Ramsden and Kiss-Haypál (2000)
have analyzed the distribution of firms by revenues in 20 countries in America, Asia and
Europe and report an exponent $\mu$ ranging from 0.44 to 1.25 with a median value equal to
0.85.

Several models have attempted to provide explanations for the distribution of firm sizes,
in terms of the law of proportional effect (Gibrat 1931, Simon and Bonini 1958), of economies
of scale and cost reduction (Bain 1956, Robinson 1961), of the distribution of managerial
talents and efficient allocation of productivity factors across managers (Lucas 1978), or of
the partition of the set of workers (Axtell 2006), among others. But only recently has the
closeness of the exponent $\mu$ to the value 1 been justified, by a simple argument proposed
by Gabaix et al. (2006). They have transposed the mechanism given for cities (Gabaix
1999a, Gabaix and Ioannides 2004, and the references therein) to firm sizes and mutual
fund capitalizations: starting from the traditional argument based upon Gibrat's law of
proportional effect, whereby firm growth is treated as a random process and growth rates
are independent of firm size, a log-normal process modified with small perturbations to ensure
convergence to a non-degenerate steady-state distribution yields a power law distribution.
The value of its exponent, $\mu = 1$, then results from the condition that the average normalized
size of firms stays constant in a stationary economy.

³ http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

⁴ Other social entities, such as cities, share this property (Zipf 1949, Gabaix 1999a, Gabaix 1999b, Gabaix and
Ioannides 2004).

Consubstantial with the fat-tailed character of the distribution of firm sizes is the
concentration of the market portfolio. Indeed, the market portfolio, defined as the value-weighted
portfolio of all the assets traded on a given market, suffers from an inherent lack of
diversification, resulting from the fat-tailed distribution of firm sizes, in the sense that only a few
dozen companies account for a very large part of the overall market capitalization. For
instance, the ten largest companies of the US market represent about one fifth to one
fourth of the US market capitalization.

More generally, given an economy of $N$ firms whose sizes $S_i$, $i = 1, \ldots, N$, follow a Pareto
law with tail index $\mu$, the ratio of the capitalization of the largest firm to the total market
capitalization,

$$R_N = \frac{\max_i S_i}{\sum_{i=1}^{N} S_i}, \tag{1}$$

which is nothing but the weight of the largest company in the market portfolio, behaves on
average like

$$E[R_N] \to 0, \quad \text{if } \mu > 1, \tag{2}$$

$$E[1/R_N] \to \frac{1}{1-\mu}, \quad \text{if } \mu < 1, \tag{3}$$

as the number of firms $N$ goes to infinity (Bingham et al. 1987).

This result means that when the distribution of firm sizes admits a finite mean, the weight 
of the largest firm in the market portfolio goes to zero, and so do the weights of any other 
firms, in the limit of a large market. In terms of asset pricing, the fact that the weight of each 
individual firm in the economy is infinitesimal ensures that the APT pricing equation holds 
for each asset and not only on average (Connor 1982). In contrast, when the distribution 
of firm sizes has no finite mean, the asymptotic weight of the largest firm in the market 
portfolio does not vanish, illustrating the fact that for such an economy, the market portfolio 
is not well diversified, all the more so the smaller the tail index $\mu$. A practical consequence
is then that the APT pricing equation, if it holds, only holds on average, with possibly large 
pricing errors for individual assets. 
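
The limit (3) can be checked with a small Monte Carlo experiment. The following is a sketch under assumptions of ours: sizes are drawn from an exact Pareto law with minimum size one (via the standard-library `random.paretovariate`), and the function name, seed, and sample sizes are illustrative choices rather than anything prescribed by the text.

```python
import random

def mean_total_to_largest(mu, n_firms, n_trials, seed=0):
    """Monte Carlo estimate of E[1/R_N] = E[sum(S) / max(S)] for Pareto sizes."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_trials):
        # paretovariate(mu) draws S with Pr[S > s] = s**(-mu) for s >= 1.
        sizes = [rng.paretovariate(mu) for _ in range(n_firms)]
        total += sum(sizes) / max(sizes)
    return total / n_trials

# Tail index below one: the largest firm keeps a non-vanishing weight.
estimate = mean_total_to_largest(mu=0.5, n_firms=500, n_trials=1000)
limit = 1.0 / (1.0 - 0.5)  # right-hand side of equation (3)
```

With $\mu = 0.5$ the estimate should settle near 2: on average the whole market is worth only about twice its largest firm, however many firms are listed.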

In order to get a closer look at the concentration of the market portfolio, we focus on its
Herfindahl index, which is perhaps the most widely used measure of economic concentration
(Polakoff 1981, Lovett 1988),

$$H_N = \|w_m\|^2 = \sum_{i=1}^{N} w_{m,i}^2, \tag{4}$$

where $w_{m,i}$ denotes the weight of asset $i$ in the market portfolio whose composition is given
by the $N$-dimensional vector $w_m$. The Herfindahl index takes into account the relative size and
distribution of the firms traded on the market. It approaches zero when the market consists 
of a large number of firms with comparable sizes. It increases both as the number of firms in 
the market decreases and as the disparity in size between those firms increases. Our use of 
the Herfindahl index is not only guided by common practice but also by its superior ability to 
provide meaningful information about the degree of diversification of an unevenly distributed 
stock portfolio (Woerheide and Persson 1993). Following tradition, we say that a portfolio 
is well-diversified, if its Herfindahl index goes to zero when the number N of firms traded in 
the market goes to infinity. 
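
As a minimal sketch of definition (4), the helper below computes the Herfindahl index from raw firm sizes. The function name and the two toy economies are hypothetical illustrations, not data from the article.

```python
def herfindahl(sizes):
    """Herfindahl index: sum of squared portfolio weights, as in equation (4)."""
    total = sum(sizes)
    return sum((s / total) ** 2 for s in sizes)

# 100 firms of equal size: the index equals 1/N.
equal = herfindahl([1.0] * 100)

# One dominant firm and three small ones: the index is close to one.
skewed = herfindahl([97.0] + [1.0] * 3)

# 1/H can be read as an effective number of equally weighted assets.
effective_equal = 1.0 / equal
effective_skewed = 1.0 / skewed
```

The reciprocal $1/H_N$ is the reading used later in the text when a concentration index of a few percent is translated into a few dozen "effective" assets.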

For illustration purposes, let us first concentrate on an economy where the sizes, sorted in
descending order, of the $N$ firms are deterministically given by

$$S_{i,N} = \left(\frac{N}{i}\right)^{1/\mu}. \tag{5}$$

We have arbitrarily chosen the size of the smallest firm as equal to one. Alternatively, one
can think of $S_{i,N}$ as the size of the $i$th largest firm relative to the size of the smallest one.
With this simple model, the rank $i$ of the $i$th largest company is directly proportional to its
size taken to the power of minus $\mu$, as it should be in order for the distribution of sizes to obey
a Pareto law with a tail index equal to $\mu$. It is easy to check that the weight of the largest
firm in the market portfolio goes to zero, as $N$ goes to infinity, when $\mu$ is larger than or equal
to one, while it goes to some positive constant when $\mu$ is less than one. More precisely, we
have

$$w_{m,1} \to 0, \quad \text{if } \mu \ge 1, \tag{6}$$

$$w_{m,1} \to \frac{1}{\zeta(1/\mu)}, \quad \text{if } \mu < 1, \tag{7}$$

where $\zeta(\cdot)$ denotes the Riemann zeta function (Abramowitz and Stegun 1972, p. 807).
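
The two regimes can be illustrated directly from the deterministic size model $S_{i,N} = (N/i)^{1/\mu}$. In the sketch below the function name and the chosen values of $\mu$ and $N$ are illustrative assumptions; the only outside fact used is $\zeta(2) = \pi^2/6$.

```python
import math

def largest_weight(mu, n_firms):
    """Weight of the largest firm when S_i = (N / i) ** (1 / mu)."""
    # The common factor N ** (1 / mu) cancels between numerator and denominator.
    return 1.0 / sum(i ** (-1.0 / mu) for i in range(1, n_firms + 1))

# mu = 0.5 < 1: the weight converges to 1 / zeta(2) = 6 / pi**2.
w_heavy = largest_weight(mu=0.5, n_firms=100_000)
zeta_limit = 6.0 / math.pi ** 2

# mu = 2 > 1: the weight keeps shrinking as the economy grows.
w_light_small = largest_weight(mu=2.0, n_firms=1_000)
w_light_large = largest_weight(mu=2.0, n_firms=100_000)
```

For $\mu = 0.5$ the largest firm alone retains roughly 60% of the market no matter how many firms are added, whereas for $\mu = 2$ its weight decays like $N^{-1/2}$.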
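For the Herfindahl index, one gets (with $\gamma$ denoting Euler's constant)

$$H_N = \begin{cases}
\dfrac{(\mu-1)^2}{\mu(\mu-2)}\,\dfrac{1}{N} + \cdots, & \mu > 2, \\[2ex]
\dfrac{\ln N + \gamma}{4N} + O\!\left(N^{-3/2}\ln N\right), & \mu = 2, \\[2ex]
\left(\dfrac{\mu-1}{\mu}\right)^{2} \zeta(2/\mu)\, N^{2/\mu-2} + \cdots, & 1 < \mu < 2, \\[2ex]
\dfrac{\pi^2}{6\,(\gamma + \ln N)^2} + O\!\left(N^{-1}(\gamma + \ln N)^{-2}\right), & \mu = 1, \\[2ex]
\dfrac{\zeta(2/\mu)}{\zeta(1/\mu)^2} + O\!\left(N^{1-1/\mu}\right), & \mu < 1.
\end{cases} \tag{8}$$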

In accordance with the behavior of the weight of the largest firm, $H_N$ goes to zero when
the index $\mu$ is larger than or equal to one, while it goes to some positive constant otherwise.
However, the decay rate of $H_N$ toward zero becomes slower and slower as $\mu$ approaches
1 (from above). In practice, when the number of traded firms is large but finite, the
concentration of the market portfolio can remain significant even if $\mu$ is larger than one,
specifically when $\mu$ lies between one and two.
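
How slowly the concentration decays at $\mu = 1$ can be seen by computing $H_N$ exactly for the deterministic size model and comparing it with the leading term $\pi^2 / (6(\gamma + \ln N)^2)$. This is a sketch only: the function names are ours, and the economy sizes of one thousand and ten thousand firms are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def herfindahl_rank_model(mu, n_firms):
    """Exact Herfindahl index when firm i has size (N / i) ** (1 / mu)."""
    s1 = sum(i ** (-1.0 / mu) for i in range(1, n_firms + 1))
    s2 = sum(i ** (-2.0 / mu) for i in range(1, n_firms + 1))
    return s2 / s1 ** 2

def zipf_leading_term(n_firms):
    """Leading term of H_N under Zipf's law (mu = 1)."""
    return math.pi ** 2 / (6.0 * (EULER_GAMMA + math.log(n_firms)) ** 2)

exact_1k = herfindahl_rank_model(1.0, 1_000)
exact_10k = herfindahl_rank_model(1.0, 10_000)
approx_1k = zipf_leading_term(1_000)
approx_10k = zipf_leading_term(10_000)

# Benchmark: an equally weighted portfolio of 10,000 firms has H = 1/N.
equal_weight_10k = 1.0 / 10_000
```

Multiplying the number of firms by ten only lowers the index from about 3% to somewhat under 2%, more than a hundred times the value $1/N$ of an equally weighted portfolio of the same size.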

In order to illustrate this situation, the upper panel of Figure 1 depicts the value of the
weight of the largest firm in the market portfolio, while the lower panel shows the Herfindahl
index, as a function of $\mu$. The solid curves show the limit situation of an infinite economy,
while the dotted and dash-dotted curves account for the effect of a finite economy: the dotted
curve refers to the case where only one thousand companies are traded, while the dash-dotted
curve corresponds to an economy with ten thousand firms. Clearly, finite economy size effects
cannot be neglected for market sizes as found in the real economy.

[Insert Figure 1 about here]

To be a little bit more general, we now consider the case where the firm sizes are randomly 
drawn from a power law distribution of size. By application of the generalized law of large 
numbers (Feller 1971, Gnedenko and Kolmogorov 1954, Ibragimov and Linnik 1975) and 
using standard results on the limit distribution of self-normalized sums (Darling 1952, Logan 
et al. 1973), we can state that 

Proposition 1. The asymptotic behavior of the concentration index $H_N$ is the following:

1. provided that $E[S^2] < \infty$,

$$H_N = \frac{1}{N}\,\frac{E[S^2]}{E[S]^2} + o_p(1/N),$$

2. provided that $S$ is regularly varying with tail index $\mu = 2$ and $s^{\mu} \cdot \Pr[S > s] \to c$ as
$s \to \infty$,

$$H_N = \frac{c}{E[S]^2}\,\frac{\ln N}{N} + o_p\!\left(\frac{\ln N}{N}\right),$$
3. provided that $S$ is regularly varying with tail index $\mu \in (1,2)$ and $s^{\mu} \cdot \Pr[S > s] \to c$ as
$s \to \infty$,

$$H_N = \left[\frac{\pi c}{2\,\Gamma(\mu/2)\,\sin(\mu\pi/4)}\right]^{2/\mu} \frac{\xi_N}{E[S]^2\, N^{2-2/\mu}},$$

where $\xi_N$ is a sequence of positive random variables with stable limit law⁵ $S(\mu/2, 1)$,
4. provided that $S$ is regularly varying with tail index $\mu = 1$ and $s^{\mu} \cdot \Pr[S > s] \to c$ as
$s \to \infty$,

$$H_N = \frac{\pi}{2}\,\frac{\xi_N}{(\ln N)^2},$$

where $\xi_N$ is a sequence of positive random variables with stable limit law $S(1/2, 1)$,
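5. provided that $S$ is regularly varying with tail index $\mu \in (0,1)$ and $s^{\mu} \cdot \Pr[S > s] \to c$ as
$s \to \infty$,

$$H_N = \left[\frac{\Gamma(\mu)\,\sin(\mu\pi/2)}{\Gamma(\mu/2)\,\sin(\mu\pi/4)}\right]^{2/\mu} \frac{\xi_N}{\zeta_N^{2}},$$

where $\xi_N$ and $\zeta_N$ are two sequences of strongly correlated⁶ positive random variables
that converge in law to $S(\mu/2, 1)$ and $S(\mu, 1)$ respectively,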

6. provided that $S$ is slowly varying⁷,

$$H_N \to 1, \quad \text{a.s.}$$

As a consequence of the fourth statement of the proposition above, for economies in
which the distribution of firm sizes follows Zipf's law ($\mu = 1$), the asymptotic behavior of the
concentration index $H_N$ of the market portfolio is given by

$$H_N = \frac{\pi}{2}\,\frac{\xi_N}{(\ln N)^2}, \tag{9}$$

where $\xi_N$ is a sequence of positive random variables with stable limit law $S(1/2, 1)$, namely
the Lévy law with density

$$f(x) = \frac{1}{\sqrt{2\pi}}\, x^{-3/2}\, e^{-\frac{1}{2x}}, \quad x > 0. \tag{10}$$

This shows that, even if the concentration of the market portfolio goes to zero in the limit of
an infinite economy, it goes to zero extremely slowly as the size $N$ of the economy diverges.
Accounting for the fact that the median value of the Lévy law (10) is approximately equal
to 2.198, a typical value of $H_N$ is 4 to 5% for a market where 7000 to 8000 assets are traded⁸,
which is much higher than the concentration index of a well-diversified portfolio (typically
the equally-weighted portfolio), which should be of the order of 0.012 to 0.014%. Intuitively,
$H_N \simeq$ 4 to 5% means that there are only about $1/H_N \simeq$ 20 to 25 effective assets in a typical
portfolio supposedly well-diversified on 7000 to 8000 assets.
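
These orders of magnitude can be reproduced with a few lines of arithmetic. The sketch below rests on two readings that we assume rather than quote: that equation (9) has the form $H_N = (\pi/2)\,\xi_N/(\ln N)^2$, and that the Lévy law has unit scale, so that it is the law of $1/Z^2$ for a standard normal $Z$. Both are consistent with the median value 2.198 given in the text. Function names are illustrative.

```python
import math
from statistics import NormalDist

def levy_median():
    """Median of the unit-scale Levy law, taken here as the law of 1 / Z**2.

    With Z standard normal, Pr[1/Z**2 <= x] = 2 * (1 - Phi(1 / sqrt(x))),
    so the median is 1 / q**2 with q the 75% quantile of Z.
    """
    q = NormalDist().inv_cdf(0.75)
    return 1.0 / q ** 2

def typical_herfindahl(n_assets):
    """Assumed form of equation (9), evaluated at the median of xi_N."""
    return (math.pi / 2.0) * levy_median() / math.log(n_assets) ** 2

h_7000 = typical_herfindahl(7000)
h_8000 = typical_herfindahl(8000)

# Effective number of assets, and the equally weighted benchmark 1/N.
effective_assets = (1.0 / h_7000, 1.0 / h_8000)
equal_weight = (1.0 / 7000, 1.0 / 8000)
```

Under these assumptions the typical index falls between 4% and 5%, about 22 to 24 effective assets, while the equally weighted benchmark is 0.014% and 0.0125% respectively.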

This simple illustrative example shows, roughly speaking, that the market portfolio
reflects the behavior of the 20 to 25 largest assets traded on the market. In this context, one
can wonder (i) how the market portfolio alone could explain the expected return on any
asset, irrespective of its size, as predicted by the CAPM and (ii) if it is actually optimal
for a rational investor to put her money in this risky portfolio alone, as proposed by the
two-fund separation theorem. This suggests that the lack of diversification of the market
portfolio is responsible, to a large extent, for the failure of the CAPM to explain the
cross-section of stock returns. This failure has been documented in particular by Fama and French
(1992, 1993), who find basically no support for the CAPM's central result of a positive
relation between expected returns and the global market risk (quantified by the beta parameter).
This therefore raises the question of the existence of a concentration premium.

⁵ The stable law $S(\alpha, \beta)$ has characteristic function

$$\psi_{\alpha,\beta}(s) = \begin{cases}
\exp\left[-|s|^{\alpha} + i s \beta \tan\dfrac{\pi\alpha}{2}\, |s|^{\alpha-1}\right], & \alpha \neq 1, \\[2ex]
\exp\left[-|s| - i s \beta\, \dfrac{2}{\pi} \ln|s|\right], & \alpha = 1,
\end{cases}$$

with $\beta \in [-1, 1]$.

⁶ More precisely, the sequence of random vectors $(\xi_N, \zeta_N)'$ converges to an operator-stable law with stable
marginal laws $S(\mu/2, 1)$ and $S(\mu, 1)$ respectively, and a spectral measure concentrated on arcs $\pm(x, x^2)$. The full
characterization of the spectral measure is beyond the scope of this article (see Meerschaert and Scheffler 2001,
Section 10.1, for details).

⁷ The random variable $S$ is slowly varying if its distribution function $F$ satisfies
$\lim_{x \to \infty} \frac{1 - F(tx)}{1 - F(x)} = 1$ for all $t > 0$. It corresponds to the limit case where $S$ is regularly varying with $\mu \to 0$.

⁸ These figures are compatible with the number of stocks currently listed on the Amex, the Nasdaq and the
NYSE.

Many authors have proposed alternative or additional factors in the quest to cure the 
deficiencies of the CAPM and provide explanations for the so-called pricing anomalies. Three 
main classes of additional factors can be distinguished: macro-economic factors, firm-specific 
factors and behavioral factors. 

Macro-economic factors. The positive or negative impact on stock prices of macro-economic
factors such as interest rates (Chen et al. 1986), exchange rates (Harvey 1991,
Ferson and Harvey 1993), real output (Cutler et al. 1989, Chen et al. 1986), inflation and
money supply (Bodie 1976, Fama 1981, Geske and Roll 1983, Pearce and Roley 1983, 
1985), aggregate consumption (Jagannathan and Wang 2007, and references therein), 
oil prices (Chen et al. 1986, Ferson and Harvey 1993, Jones and Kaul 1996), labor 
income (Jagannathan and Wang 1996, Reyfman 1997) and so on, has been underlined 
in many studies based on the APT (Ross 1976, Roll and Ross 1984, Roll 1994) or in the 
context of equilibrium (Burmeister and Wall 1986, Flannery and Protopapadakis 2002). 

Firm-specific factors. The fact that industry sector groupings may be important in 
the study of the return generating process has been stressed for a long time (King 
1966, Alexander and Francis 1986). Similarly, the importance of market capitalization 
(or small-firm effect) has been documented in the early eighties by Banz (1981) and 
Reinganum (1981) while Stattman (1980) and Rosenberg et al. (1985) underlined the 
role of the book-to-market ratio. Although other ratios such as the earnings-to-price ratio (Basu
1977) and the dividend yield (Blume 1980, Rozeff 1984, Keim 1985), for instance, also
predict future returns, most of the attention has been drawn to the size and the book-to-market
effects during the past decade as a result of their superior performance in explaining
the cross-section of stock returns (Fama and French 1992, 1993, 1995, 1996). Among
various interpretations of the explanatory power of the size and the book-to-market ratio,
Campbell and Vuolteenaho (2004) and Campbell et al. (2005) have considered breaking
the beta of a stock with the market portfolio into two components, one reflecting
news about the market's future cash flows and one reflecting news about the market's
discount rates, in order to explain the size and value "anomalies" in stock returns.

Behavioral factors. Two major issues have been considered. On the one hand,
Rubinstein (1973) and Kraus and Litzenberger (1976) have proposed to account for the
departure of the distributions of returns from normality and for the sensitivity of
investors to the skewness and kurtosis of the distribution of stock returns. The
relevance of this approach has been underlined by Lim (1989) and Harvey and Siddique
(2000), who have tested the role of the asymmetry in the risk premium by accounting
for the skewness of the distribution of returns. Along the same lines, many other
extensions have been presented, such as the VaR-CAPM (Alexander and Baptista 2002), in
order to account more carefully for the risk perception of investors. On the other hand,
several studies have developed phenomenological models capturing the reversal of
long-term returns (Chan 1988, Chopra et al. 1992, DeBondt and Thaler 1985, 1987) and the
continuation of short-term trends (Chan et al. 1996, Jegadeesh and Titman 1993,
Jegadeesh and Titman 2001, Richards 1999).

Most of these factors actually provide a significant improvement in explaining the cross- 
section of asset returns. However they do not provide a clear identification of the most 
prominent ones. Even if the Fama and French three factor model is now widely recognized 
as the benchmark, the reasons for its superiority in explaining the cross-section of asset 
returns are still debated. It is in this context that we propose to focus on the consequences 
of the indisputable fact that the market portfolio is highly concentrated on a small number
of very large companies and therefore cannot account for the behavior of the
smallest ones. As we are going to demonstrate, this will allow us to rationalize the size
effect, in relation with what we propose to call a "diversification factor," which, to some
extent, also justifies the relevance of the book-to-market factor.

2 Internal consistency conditions of factor models and 
their consequences on diversification 

Under the assumption that the return on the market portfolio is a factor explaining the 
return on individual assets, our demonstration is based on two ingredients. 

• The internal consistency condition states that the market portfolio is made of the 
assets whose returns it is supposed to explain. As a consequence, there are correlations 
between the disturbance terms. 

• The lack of diversification of the market portfolio (associated with the fat-tailed
distribution of firm sizes) makes these correlations non-negligible, giving rise to an
additional factor which significantly contributes to the asymptotic variance of a priori
well-diversified portfolios.

2.1 The factor model 

Consider an economy with $N$ firms whose stock returns are determined according
to the following equation

$$r = \alpha + \beta \cdot [r_m - E[r_m]] + B\phi + \varepsilon, \tag{11}$$

where

• $r$ is the random $N \times 1$ vector of asset returns;

• $\alpha = E[r]$ is the $N \times 1$ vector of asset return mean values. We make no
assumption either on the ex-ante mean-variance efficiency of the market portfolio or
on the absence of arbitrage opportunity, so that $\alpha$ is not, a priori, specified;

• $r_m$ is the random return on the market portfolio;

• $\beta$ is the $N \times 1$ vector of the factor loadings of the market factor;

• $\phi$ is the random $q \times 1$ vector of risk factors $\phi_i$, which are assumed to have zero mean
($E[\phi_i] = 0$) and unit variance, and to be uncorrelated with each other and with $r_m$;

• $B$ is the $N \times q$ matrix of factor loadings;

• $\varepsilon$ is the random $N \times 1$ vector of disturbance terms with zero average $E[\varepsilon] = 0$ and
covariance matrix $\Omega = E[\varepsilon \varepsilon']$. The disturbance terms are assumed to be uncorrelated
with the market return $r_m$ and the factors $\phi_i$.

It would be natural to assume that (i) $\Omega$ is diagonal in order to have the $i$th component of
$\varepsilon$ embodying the specific risk contribution to the $i$th asset but, as we shall see in the sequel,
the internal consistency condition makes this impossible and forces the disturbances $\varepsilon$ to be
correlated. A weaker hypothesis on $\Omega$ would be that (ii) all its eigenvalues are uniformly
bounded from above by some constant $\Lambda$ (i.e., the bound is independent of the size of the
economy: $\forall N$, $\max_{\|x\|=1} x'\Omega x < \Lambda$). This implies that the covariance matrix of the stock returns,
defined as

$$\Sigma = E[(r - \alpha)(r - \alpha)'] = \beta\beta' \cdot \mathrm{Var}\, r_m + BB' + \Omega, \tag{12}$$

where the prime denotes the transpose operator, has an approximate $q + 1$ factor
structure, according to the definition in Chamberlain (1983) and Chamberlain and Rothschild
(1983). But these two assumptions (i) and (ii) are in fact equivalent, as shown by Grinblatt
and Titman (1985). Indeed, a simple repackaging of the $N$ security returns into $N$ new
returns constructed by forming $N$ portfolios of the primitive assets allows one to get a new
formulation of expression (11) with mutually uncorrelated disturbance terms.

To understand why the disturbance terms cannot be uncorrelated, let us first denote by $w_m$ the vector of the weights of the market portfolio. Accounting for the fact that the market factor is itself built upon the universe of assets that it is supposed to explain, the model must necessarily fulfill the internal consistency relation 

$$r_m = w_m' \cdot r. \tag{13}$$

Left-multiplying (11) by $w_m'$, the internal consistency condition (13) implies the following relationship 

$$\left(w_m' \cdot \beta - 1\right)\left(r_m - E[r_m]\right) + w_m' B \phi + w_m' \cdot \varepsilon = 0. \tag{14}$$

Then, by our assumption of absence of correlation between $r_m$, $\phi$ and $\varepsilon$, it follows trivially (see footnote 9) that 

$$w_m' \cdot \varepsilon = 0 \quad \text{almost surely}, \tag{15}$$

while 

$$w_m' \cdot \beta = 1 \quad \text{and} \quad w_m' B = 0. \tag{16}$$
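As a minimal numerical sketch of the constraints (15) and (16), one can regress arbitrary asset returns on the return of a portfolio built from these same assets. The toy data, the function names and the use of ordinary least squares are illustrative assumptions, not part of the original derivation; the point is only that the weighted betas sum to one and the weighted residuals vanish in every period.

```python
import random


def make_toy_market(n_periods=200, n_assets=5, seed=0):
    """Hypothetical toy data: Pareto-distributed sizes and Gaussian returns."""
    rng = random.Random(seed)
    sizes = [rng.paretovariate(1.0) for _ in range(n_assets)]
    total = sum(sizes)
    weights = [s / total for s in sizes]          # market weights, sum to one
    returns = [[rng.gauss(0.0, 0.02) for _ in range(n_assets)]
               for _ in range(n_periods)]
    return returns, weights


def market_model_residuals(returns, weights):
    """Regress each asset on r_m = w'r by OLS; return (betas, residuals)."""
    n_periods, n_assets = len(returns), len(weights)
    r_m = [sum(w * x for w, x in zip(weights, row)) for row in returns]
    mean_m = sum(r_m) / n_periods
    var_m = sum((x - mean_m) ** 2 for x in r_m) / n_periods
    alphas, betas = [], []
    for i in range(n_assets):
        col = [row[i] for row in returns]
        mean_i = sum(col) / n_periods
        cov = sum((c - mean_i) * (m - mean_m)
                  for c, m in zip(col, r_m)) / n_periods
        alphas.append(mean_i)                     # alpha_i = sample mean of r_i
        betas.append(cov / var_m)
    residuals = [[row[i] - alphas[i] - betas[i] * (r_m[t] - mean_m)
                  for i in range(n_assets)]
                 for t, row in enumerate(returns)]
    return betas, residuals


if __name__ == "__main__":
    rets, w = make_toy_market()
    b, eps = market_model_residuals(rets, w)
    print("w'beta =", sum(wi * bi for wi, bi in zip(w, b)))
    print("max |w'eps| =", max(abs(sum(wi * e for wi, e in zip(w, row)))
                               for row in eps))
```

Both identities hold up to floating-point error whatever the distribution of the returns, which is the sense in which the condition is tautological.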

Several authors have pointed out a consequence of the internal consistency condition, namely that the market portfolio is made of (or can be replicated by) the assets it is intended to explain (Fama 1973, Sharpe 1990). An a priori important consequence of this condition is the breakdown of the standard assumption of independence (or, at least, of the absence of correlation) between the non-systematic components of the returns of securities. In other words, the standard factor model decompositions assume that the disturbance terms for security $i$ are uncorrelated with the comparable components for security $j$. But this cannot be strictly the case, as can be seen from the above derivation. The presence of correlations between the disturbance terms may a priori pose problems in the pricing of portfolio risks: only when the disturbance terms can be averaged out by diversification can one conclude that the only non-diversifiable risk of a portfolio is borne by the contribution of the market portfolio, weighted by the beta of the portfolio under consideration. Previous authors have suggested that this is indeed what happens in the limit of a large market $N \to \infty$, for which the correlations between the disturbance terms vanish asymptotically and the internal consistency condition seems irrelevant. For example, while Sharpe (1990, footnote 13) concluded that, as a consequence of equation (15), at least two of the disturbances, say $\varepsilon_i$ and $\varepsilon_j$, must be negatively correlated, he suggested that this problem may disappear in economies with infinitely many securities. Actually, we show below that this apparently quite reasonable line of reasoning does not tell the whole story: even for economies with infinitely many securities, when the companies exhibit a broad distribution of sizes as they do in reality, the constraint (15) can lead to the important consequence that the risk borne by an investor holding a well-diversified portfolio does not reduce to the market risk in the limit of a very large portfolio, as usually believed. A significant proportion of "specific risk" may remain which cannot be diversified away by a simple aggregation of a very large number of assets. 



2.2 Correlation structure of the disturbance terms 

The fact that the disturbance terms $\varepsilon$ in the market model (11) are correlated according to the condition (15) means that there exists at least one common "factor" $f$ to the $\varepsilon$'s, so that $\varepsilon$ can be expressed as 

$$\varepsilon = \gamma \cdot f + \eta, \tag{17}$$

where $\gamma$ is the vector of loadings of the factor $f$ (see footnote 10). The factor $f$ could be chosen a priori such as to explain one of the many anomalies reported in the previous section. But, as recalled, we want to move away from this logic of invoking macro-economic, firm-specific or behavioral factors. We prefer to focus on the parsimonious single market factor model, and just account for the lack of diversification of the market portfolio, which calls for a diversification premium. As a bonus, we will see that this strategy turns out to provide a fundamental basis for explaining a significant part of the pricing anomalies. Our only requirement is that the covariance matrix of $\varepsilon$ exhibits an eigenvalue that goes to infinity in the limit of an infinite economy, when $H_N$ does not go to zero. In contrast, when $H_N$ goes to zero as $N \to \infty$, the largest eigenvalue should remain bounded. This requirement derives simply from the results of Chamberlain (1983) and Chamberlain and Rothschild (1983), who have linked the existence of $K$ unbounded eigenvalues (in the limit $N \to \infty$) of the covariance matrix of the asset returns to a unique approximate factor structure, such that the $K$ associated eigenvectors converge and play the role of $K$ factor loadings. 

9 Right multiplying equation (14) by $\varepsilon'$ and taking the expectation, given that the return on the market portfolio, the factors $\phi$ and the disturbance terms $\varepsilon$ are uncorrelated, we obtain that $w_m' \cdot \Omega = 0$. Then, right multiplying $w_m' \cdot \Omega = 0$ by $w_m$ gives $0 = w_m' \cdot \Omega \cdot w_m = w_m' \cdot E[\varepsilon\varepsilon'] \cdot w_m = E[(w_m' \cdot \varepsilon) \cdot (w_m' \cdot \varepsilon)'] = E[(w_m' \cdot \varepsilon)^2]$, hence the result (15). 

10 With this representation, we avoid the case where the explaining factor (here the market portfolio) could be replicated by a single traded asset. Indeed, in such a case, the replicating portfolio would be concentrated on one single asset, say the first one, so that the internal consistency condition would read $\varepsilon_1 = 0$ without any other constraint on the $\varepsilon_i$, $i > 1$. 

For simplicity, we choose $\eta$ to be a vector of uncorrelated residuals with zero mean (see footnote 11). Since $w_m' \varepsilon = 0$, $f$ and $\eta$ are not independent from one another. More precisely, we have 

$$f = -\frac{w_m' \eta}{w_m' \gamma}, \tag{18}$$

provided that $w_m' \gamma \neq 0$; if not, the random vector $\eta$ would have to satisfy $w_m' \eta = 0$, which contradicts our assumption of an absence of correlations between the components of $\eta$. Therefore, in this framework, $f$ is not actually a factor (it would have to be uncorrelated with $\eta$ if it were) but is rather an "endogenous" factor. The market model (11) then becomes 

$$r = \alpha + \beta \cdot \left[r_m - E[r_m]\right] + \gamma \cdot f + \eta, \tag{19}$$

with 

• $\mathrm{Cov}(r_m, f) = \mathrm{Cov}(r_m, \eta) = 0$, as the result of the absence of correlation between $r_m$ and $\varepsilon$, 

• $\mathrm{Var}\, \eta = \Delta$, where $\Delta$ is a diagonal matrix, 

• $\mathrm{Var}\, f = \frac{w_m' \Delta w_m}{(w_m' \gamma)^2}$, and 

• $\mathrm{Cov}(f, \eta) = -\frac{1}{w_m' \gamma}\, w_m' \Delta$. 
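The algebra behind these moments can be checked on a small example. The sketch below is illustrative only: the three-asset weights, loadings and variances are made-up numbers, and the function names are hypothetical. It builds the covariance matrix of the disturbances from their representation as a loading vector times the endogenous factor plus uncorrelated residuals, and verifies that the market-weighted combination of the disturbances has zero variance.

```python
def var_f(w, gamma, delta):
    """Var f = (w' Delta w) / (w' gamma)^2 for diagonal Delta = diag(delta)."""
    wg = sum(wi * gi for wi, gi in zip(w, gamma))
    return sum(wi * wi * di for wi, di in zip(w, delta)) / wg ** 2


def omega_matrix(w, gamma, delta):
    """Covariance matrix of eps = gamma * f + eta, with f = -(w'eta)/(w'gamma).

    Writing eps = M eta with M = I - gamma w' / (w'gamma) gives
    Omega = M diag(delta) M'.
    """
    n = len(w)
    wg = sum(wi * gi for wi, gi in zip(w, gamma))
    m = [[(1.0 if i == k else 0.0) - gamma[i] * w[k] / wg for k in range(n)]
         for i in range(n)]
    return [[sum(m[i][k] * delta[k] * m[j][k] for k in range(n))
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    w = [0.5, 0.3, 0.2]            # hypothetical market weights
    gamma = [1.0, 0.5, 2.0]        # hypothetical loadings
    delta = [0.04, 0.09, 0.01]     # hypothetical residual variances
    omega = omega_matrix(w, gamma, delta)
    print("Var f   =", var_f(w, gamma, delta))
    print("Omega w =", [sum(omega[i][j] * w[j] for j in range(3))
                        for i in range(3)])
```

The vector printed last is zero up to rounding, which is the matrix form of the internal consistency condition.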

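In order to understand and illustrate the relevance and the limits of the assertion according to which the correlations between two disturbance terms $\varepsilon_i$ and $\varepsilon_j$ should be negligible in an infinite size market (Fama 1973, Sharpe 1990), let us now evaluate their typical magnitude. To simplify the notations, let us rescale without loss of generality the vector $\gamma$ by $w_m' \gamma$, so that the relation (18) becomes 

$$f = -w_m' \eta, \tag{20}$$

with $w_m' \gamma = 1$. The covariance matrix $\Omega$ of $\varepsilon$ is 

$$\Omega = (w_m' \Delta w_m)\, \gamma\gamma' - \gamma w_m' \Delta - \Delta w_m \gamma' + \Delta, \tag{21}$$

and the correlation between $\varepsilon_i$ and $\varepsilon_j$ ($i \neq j$) is 

$$\rho_{ij} = \frac{(w_m' \Delta w_m)\, \gamma_i \gamma_j - \gamma_i w_{m,j} \Delta_{jj} - \gamma_j w_{m,i} \Delta_{ii}}{\sqrt{\left[(w_m' \Delta w_m)\, \gamma_i^2 - 2 \gamma_i w_{m,i} \Delta_{ii} + \Delta_{ii}\right] \cdot \left[(w_m' \Delta w_m)\, \gamma_j^2 - 2 \gamma_j w_{m,j} \Delta_{jj} + \Delta_{jj}\right]}}. \tag{22}$$

For illustration purposes, let us assume that all the $\gamma_i$'s are equal to one (the condition $w_m' \gamma = 1$ is then automatically satisfied from the normalization of the weights $w_m$) and that $\Delta_{ii} = \Delta$ for all $i$'s. Denoting by $H_N = \|w_m\|^2$ the concentration index of the market portfolio, the cumbersome relation (22) simplifies into 

$$\rho_{ij} = \frac{H_N - w_{m,i} - w_{m,j}}{\sqrt{\left(1 + H_N - 2 w_{m,i}\right)\left(1 + H_N - 2 w_{m,j}\right)}} \tag{23}$$

$$\phantom{\rho_{ij}} = \frac{H_N}{1 + H_N}\left(1 + O\left(w_{m,i,j}/H_N\right)\right). \tag{24}$$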
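To get a feeling for the size of these correlations, the following sketch evaluates the exact pairwise correlation in the simplified case (all loadings equal to one, identical residual variances) and compares it with the leading-order value, the concentration index divided by one plus the concentration index. The weight vector used in the example, one dominant firm plus many small ones, is a made-up illustration, and the function names are hypothetical.

```python
import math


def herfindahl(weights):
    """Concentration index H = sum of squared weights."""
    return sum(x * x for x in weights)


def rho_exact(weights, i, j):
    """Correlation of eps_i and eps_j when gamma_i = 1 and Delta_ii = Delta."""
    h = herfindahl(weights)
    num = h - weights[i] - weights[j]
    den = math.sqrt((1.0 + h - 2.0 * weights[i]) * (1.0 + h - 2.0 * weights[j]))
    return num / den


def rho_leading_order(weights):
    """Approximation H / (1 + H), valid when w_i and w_j are small against H."""
    h = herfindahl(weights)
    return h / (1.0 + h)


if __name__ == "__main__":
    # Hypothetical market: one firm holds half of the capitalization,
    # the other half is split equally among 1000 small firms.
    w = [0.5] + [0.0005] * 1000
    print("exact correlation between two small firms:", rho_exact(w, 1, 2))
    print("leading-order approximation:", rho_leading_order(w))
```

In this example the two values agree closely because the small firms' weights are negligible against the concentration index; for a pair involving the dominant firm the correction term is no longer small.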



11 It should be enough to assume that all the eigenvalues of the covariance matrix of $\eta$ are positive and uniformly bounded by some positive constant (Grinblatt and Titman 1983). 

Then, expression (24) shows that, provided that the market portfolio is sufficiently well-diversified, namely provided that the weight of each asset and the concentration index go to zero in the limit of a large market ($N \to \infty$), the correlation $\rho_{ij}$ between any two disturbance terms goes to zero, as usually assumed. However, the largest eigenvalue of the correlation matrix, associated with the (asymptotic) eigenvector $1 = (1, 1, \ldots, 1)'$, is $\lambda_{\max,N} \simeq N \cdot \frac{H_N}{1 + H_N}$ and goes to infinity, as the size of the economy grows unbounded, as soon as $H_N$ goes to zero more slowly than $1/N$. This clearly shows that the correlations between the disturbance terms are not necessarily negligible. 

The question that we now have to address is whether these weak correlations may challenge the usual assumption that well-diversified portfolios do not bear additional non-diversified sources of risk. For this, let us consider a well-diversified portfolio $w_p$, i.e., a portfolio such that $\|w_p\|^2 \to 0$ as the size of the economy goes to infinity. From equation (21), the residual variance of this portfolio, namely the part of the variance of the portfolio that cannot be ascribed to systematic risk factors, reads 

$$w_p' \Omega w_p = (w_m' \Delta w_m)\,(\gamma' w_p)^2 - 2\,(w_m' \Delta w_p)\,(\gamma' w_p) + w_p' \Delta w_p. \tag{25}$$

In addition to our previous hypothesis that $\Delta$ is a diagonal matrix, we assume that its entries are uniformly bounded from below by some positive constant $c_1$ and from above by some constant $c_2 < \infty$, and that $|\gamma' w_p|$ is uniformly bounded from below by some positive constant $c'$ and from above by some finite constant $c''$ (this is the case, for instance, when one considers $\gamma = 1$, which is compatible with the requirement $w_m' \cdot \gamma = 1$ assumed in the representation (21)). Then 

$$w_p' \Delta w_p \le c_2 \cdot \|w_p\|^2 \to 0, \tag{26}$$

$$\left|(w_m' \Delta w_p)(\gamma' w_p)\right| \le c_2 \cdot c'' \cdot \|w_m\| \cdot \|w_p\| \to 0, \tag{27}$$

and 

$$c_1 \cdot c'^2 \cdot \|w_m\|^2 \le (w_m' \Delta w_m)\,(\gamma' w_p)^2 \le c_2 \cdot c''^2 \cdot \|w_m\|^2, \tag{28}$$

so that, since $\|w_m\|^2 = H_N$, 

$$w_p' \Omega w_p \sim K \cdot H_N, \quad K > 0, \quad \text{as } N \to \infty. \tag{29}$$
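The role of the concentration index can be made concrete with a short computation of the residual variance of a portfolio from its three constituent terms (market-weighted residual variance times squared portfolio loading, cross term, and own residual variance). This is a sketch under simplifying assumptions that are not in the text: loadings all equal to one, a common residual variance, and deterministic Zipf-like market weights proportional to 1/rank. In that special case the residual variance of the equally-weighted portfolio equals the common variance times (H - 1/N), where H is the concentration index of the market portfolio.

```python
def residual_variance(w_m, w_p, gamma, delta):
    """w_p' Omega w_p for diagonal Var eta = diag(delta), assuming w_m'gamma = 1."""
    a = sum(wm * wm * d for wm, d in zip(w_m, delta))            # w_m' Delta w_m
    b = sum(wm * d * wp for wm, d, wp in zip(w_m, delta, w_p))   # w_m' Delta w_p
    c = sum(wp * wp * d for wp, d in zip(w_p, delta))            # w_p' Delta w_p
    g = sum(gi * wp for gi, wp in zip(gamma, w_p))               # gamma' w_p
    return a * g * g - 2.0 * b * g + c


def zipf_weights(n):
    """Hypothetical market weights proportional to 1/rank."""
    raw = [1.0 / k for k in range(1, n + 1)]
    total = sum(raw)
    return [x / total for x in raw]


if __name__ == "__main__":
    n, var_eta = 1000, 0.04
    w_m = zipf_weights(n)
    w_e = [1.0 / n] * n
    h = sum(x * x for x in w_m)
    res = residual_variance(w_m, w_e, [1.0] * n, [var_eta] * n)
    print("H_N =", h, " effective number of assets =", 1.0 / h)
    print("residual variance =", res, " vs naive var_eta/N =", var_eta / n)
```

With these weights the residual variance is more than an order of magnitude larger than the naive value obtained by ignoring the correlations between disturbances.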

Therefore, the residual variance $w_p' \Omega w_p$ of any "well-diversified portfolio" $w_p$ goes to zero, as the size $N$ of the economy goes to infinity, if and only if the concentration index $H_N$ of the market portfolio goes to zero. In the case of a real economy, section 1 has shown that the Herfindahl index $H_N$ of the market portfolio goes to zero, but at the particularly slow decay rate of $1/(\ln N)^2$. As a consequence, the residual variance may still account for a significant part of the total portfolio variance. We will give a numerical example in the next paragraph, providing a more precise statement concerning the behavior of the residual variance of the equally-weighted portfolio. 



2.3 Asymptotic behavior of the variance of the excess return of the 
equally-weighted portfolio 

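In order to investigate more precisely the impact of the correlations between the disturbance terms induced by the condition of internal consistency on the variance of the returns of a "well-diversified" portfolio, we consider first the simple case of the equally-weighted portfolio whose composition is given by the vector $w_e = \frac{1}{N} 1$. Algebraic manipulations yield 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \bar\gamma_N^2 \cdot \frac{\sum_{i=1}^N S_i^2 \Delta_{ii}}{\left(\sum_{i=1}^N S_i \gamma_i\right)^2} - \frac{2 \bar\gamma_N}{N} \cdot \frac{\sum_{i=1}^N S_i \Delta_{ii}}{\sum_{i=1}^N S_i \gamma_i} + \frac{1}{N}\left(\frac{1}{N} \sum_{i=1}^N \Delta_{ii}\right), \tag{30}$$

where $r_e$ denotes the return on the equally-weighted portfolio, $\beta_e$ its beta with the market factor and $\bar\gamma_N = \frac{1}{N}\sum_{i=1}^N \gamma_i$ the average loading. We have reintroduced the explicit dependence on the term $w_m' \gamma$ (no longer assumed to be scaled to the value 1) and have made explicit the fact that the market weight of firm $i$ is $w_{m,i} = S_i / \sum_{j=1}^N S_j$. 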
Two of the four terms in the right-hand side (r.h.s.) of expression (30) are standard. The first term $\beta_e^2 \cdot \mathrm{Var}\, r_m$ is the traditional contribution of the market risk factor weighted by the beta of the portfolio. The last term in the r.h.s. of (30) represents the usual contribution of the diversifiable risk of the portfolio when one assumes that the disturbance terms are uncorrelated, and therefore represents the specific sources of risk. The two other terms are new and result from the existence of correlations between the disturbances. In the absence of such correlations, $\gamma$ would be zero and these two terms would disappear. 

Assuming that the $\Delta_{ii}$'s are $N$ iid (positive) random variables with finite expected value $E[\Delta_{ii}] < \infty$, we get that $\frac{1}{N}\left(\frac{1}{N}\sum_{i=1}^N \Delta_{ii}\right)$ and $\frac{\bar\gamma_N}{N} \cdot \left(\frac{\sum_{i=1}^N S_i \Delta_{ii}}{\sum_{i=1}^N S_i \gamma_i}\right)$ are $O_p(1/N)$, irrespective of the fact that the distribution of firm sizes admits or does not admit a finite mean (see footnote 12). This implies that, in the limit of large $N$, the third and fourth terms in the r.h.s. of expression (30) can be neglected, leading to 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \bar\gamma_N^2 \cdot \frac{\sum_{i=1}^N S_i^2 \Delta_{ii}}{\left(\sum_{i=1}^N S_i \gamma_i\right)^2} + O_p(1/N). \tag{31}$$

The fact that the fourth term in expression (30) disappears in the limit $N \to \infty$ is not surprising since it recovers the standard result on the diversification of the idiosyncratic risks. More interestingly, the fact that the third term in (30) also goes to zero as $1/N$ means that it does not introduce (in the limit of a large market) an additional risk worth considering. 

Proposition 2 below reveals through expression (31) that the only significant additional contribution to the risk of the equally-weighted portfolio stems from the term 

$$\bar\gamma_N^2 \cdot \frac{\sum_{i=1}^N S_i^2 \Delta_{ii}}{\left(\sum_{i=1}^N S_i \gamma_i\right)^2}, \tag{32}$$

which is nothing but the variance (conditional on the $\gamma_i$'s and the $S_i$'s) of the term $\bar\gamma_N \cdot f$ resulting from the expression of the market model (19). 

By the same kind of arguments as in Proposition 1, we get that the contribution (32) exhibits three different behaviors. Either the variance of the distribution of firm sizes is finite and the term (32) goes to zero as $1/N$, or only the mean of the distribution of firm sizes exists and the term (32) goes to zero at a much slower rate or, finally, if the mean of the distribution of firm sizes does not exist, the additional risk term (32) converges to some finite positive value. More precisely, we can state the following results: 

Proposition 2. Assuming that the $\gamma_i$'s are iid random variables such that $E[|\gamma|] < \infty$, and that the $\Delta_{ii}$'s are iid positive random variables such that $E[\Delta_{ii}] = \bar\Delta < \infty$, the asymptotic behavior of the variance of the equally-weighted portfolio is the following: 

1. provided that $E[S^2] < \infty$, 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + O_p(1/N),$$

2. provided that $S$ is regularly varying with tail index $\mu = 2$ and $s^\mu \cdot \Pr[S > s] \to c > 0$, as $s$ goes to infinity, 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \frac{c\, \bar\Delta}{E[S]^2} \cdot \frac{\ln N}{N} + o_p\left(\ln N / N\right),$$

3. provided that $S$ is regularly varying with tail index $\mu \in (1, 2)$ and $s^\mu \cdot \Pr[S > s] \to c > 0$, as $s$ goes to infinity, 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \left[\frac{\pi c\, E\left[\Delta^{\mu/2}\right]}{2\, \Gamma\left(\frac{\mu}{2}\right) \sin\frac{\mu\pi}{4}}\right]^{2/\mu} \cdot \frac{1}{E[S]^2\, N^{2 - 2/\mu}} \cdot \xi_N + o_p\left(\frac{1}{N^{2 - 2/\mu}}\right),$$

where $\xi_N$ is a sequence of positive random variables with stable limit law $S(\mu/2, 1)$, 

4. provided that $S$ is regularly varying with tail index $\mu = 1$ and $s^\mu \cdot \Pr[S > s] \to c > 0$, as $s$ goes to infinity, 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \frac{\pi \cdot E[\gamma]^2 \cdot E\left[\Delta^{1/2}\right]^2}{2 \cdot E[|\gamma|]^2 \cdot (\ln N)^2} \cdot \xi_N + o_p\left(1/\ln^2 N\right),$$

where $\xi_N$ is a sequence of positive random variables with stable limit law $S(1/2, 1)$, 

5. provided that $S$ is regularly varying with tail index $\mu \in (0, 1)$ and $s^\mu \cdot \Pr[S > s] \to c > 0$, as $s$ goes to infinity, 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \frac{E\left[\Delta^{\mu/2}\right]^{2/\mu} E[\gamma]^2}{E\left[|\gamma|^\mu\right]^{2/\mu} \pi^{1/\mu}} \left[\Gamma\left(\frac{1 + \mu}{2}\right) \cos\frac{\mu\pi}{4}\right]^{2/\mu} \frac{\xi_N}{\zeta_N^2} + o_p(1),$$

where $\xi_N$ and $\zeta_N$ are two sequences of strongly correlated (see footnote 13) positive random variables that converge in law to $S(\mu/2, 1)$ and $S(\mu, \beta_\gamma)$ with $\beta_\gamma = \frac{E\left[|\gamma|^\mu 1_{\gamma > 0}\right] - E\left[|\gamma|^\mu 1_{\gamma < 0}\right]}{E\left[|\gamma|^\mu\right]}$, respectively. 

12 The term within the parentheses converges in law either to zero, if $E[S] < \infty$, or to some non-degenerate distribution, if $S$ is regularly varying with tail index less than one. 
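The different regimes can be explored numerically by evaluating the additional risk term directly, that is, the squared average loading times the sum of squared sizes weighted by residual variances, divided by the squared sum of sizes weighted by loadings. The sketch below is only illustrative: it assumes unit loadings and unit residual variances in the demonstration run, draws sizes from a Pareto law with Python's standard generator, and uses hypothetical function names. Because the heavy-tailed cases fluctuate strongly from one draw to the next, a single run gives an order of magnitude and not a limit.

```python
import random


def consistency_term(sizes, gammas, deltas):
    """gamma_bar^2 * sum(S_i^2 Delta_ii) / (sum(S_i gamma_i))^2."""
    n = len(sizes)
    gamma_bar = sum(gammas) / n
    num = sum(s * s * d for s, d in zip(sizes, deltas))
    den = sum(s * g for s, g in zip(sizes, gammas)) ** 2
    return gamma_bar ** 2 * num / den


def pareto_sizes(n, mu, seed=0):
    """Firm sizes with Pr[S > s] = s^(-mu) for s >= 1."""
    rng = random.Random(seed)
    return [rng.paretovariate(mu) for _ in range(n)]


if __name__ == "__main__":
    for mu in (3.0, 1.0, 0.5):
        for n in (1000, 10000):
            s = pareto_sizes(n, mu)
            term = consistency_term(s, [1.0] * n, [1.0] * n)
            print(f"mu={mu}, N={n}: additional term = {term:.4g}, 1/N = {1.0 / n:.4g}")
```

Two properties are worth noting: with identical sizes the term reduces to the common residual variance divided by the number of assets, and it is unchanged when all loadings are multiplied by the same constant.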

Focusing on the case where $\mu$ is equal (or close) to one, as in real markets, proposition 2 tells us that the asymptotic behavior of the variance of the equally-weighted portfolio is given by 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \frac{\pi \cdot E[\gamma]^2 \cdot E\left[\Delta^{1/2}\right]^2}{2 \cdot E[|\gamma|]^2 \cdot (\ln N)^2} \cdot \xi_N + o_p\left(1/(\ln N)^2\right), \tag{33}$$

where $\xi_N$ is a sequence of positive random variables with stable limit law $S(1/2, 1)$ whose density is given by (10). Expression (33) implies that the variance of the equally-weighted portfolio, while asymptotically proportional to the variance of the market portfolio, receives a significant contribution due to the internal consistency condition together with the Zipf distribution of company sizes. This additional contribution decays to zero extremely slowly with the number $N$ of companies in the economy. For instance, (i) assuming that the variance $\Delta_{ii}$ of the residuals $\eta_i$ is the same for all of them and, a priori, of the same order as the variance of the market return, $\Delta_{ii} \simeq \mathrm{Var}\, r_m$, (ii) considering that the ratio $\frac{E[\gamma]}{E[|\gamma|]}$ is of the order of one and (iii) accounting for the fact that the median value of the Lévy law is approximately equal to 2.198, the additional term is typically of the order of $\frac{\pi}{2} \cdot \frac{2.198}{(\ln N)^2} \cdot \mathrm{Var}\, r_m$. So, assuming that $\beta_e$ is about one and considering a market where 7000 to 8000 assets are traded (see footnote 14), the typical amplitude of the additional term represents 5% of the total variance of the equally-weighted portfolio. More precisely, in one case out of two, the contribution of the additional term is larger than 5% of the total variance. The figure presents the probability to reach or exceed a given level for the contribution of the residual variance to the total variance, in an economy with 7000-8000 traded assets. In one case out of four ($p = 0.25$), the contribution of the residual variance to the total variance is larger than 15%; in one case out of ten ($p = 0.1$), it represents more than 50%. 
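These orders of magnitude can be reproduced with a few lines of arithmetic. The sketch below rests on assumptions that should be read as such: the stable law with index 1/2 is taken to be the standard Lévy law, i.e. the law of the inverse square of a standard normal variable (which gives the median 2.198 quoted above); the residual variance is set equal to the market variance; the ratio of the mean loading to the mean absolute loading is set to one; and the prefactor of the additional term is taken as pi/2 divided by the squared logarithm of the number of assets. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def levy_exceedance_level(p):
    """Level q with Pr[xi > q] = p, for xi = 1/Z^2 and Z standard normal.

    Pr[1/Z^2 > q] = Pr[|Z| < q^(-1/2)] = 2 * Phi(q^(-1/2)) - 1.
    """
    z = NormalDist().inv_cdf(0.5 + p / 2.0)
    return 1.0 / (z * z)


def residual_share(p, n_assets, beta_e=1.0):
    """Share of total variance due to the additional term, exceeded with prob. p.

    Variances are expressed in units of Var r_m (assumed equal to Delta_ii).
    """
    extra = (math.pi / 2.0) * levy_exceedance_level(p) / math.log(n_assets) ** 2
    return extra / (beta_e ** 2 + extra)


if __name__ == "__main__":
    print("median of the Levy law:", levy_exceedance_level(0.5))
    for p in (0.5, 0.25, 0.1):
        print(f"p = {p}: share exceeded = {residual_share(p, 7500):.3f}")
```

Under these assumptions the shares exceeded with probability 0.5, 0.25 and 0.1 come out at a few percent, above 15% and above 50% respectively, in line with the figures quoted in the text.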



[Insert Figure about here] 



2.4 Relation with the concentration of the market portfolio 

The variance of the term $\bar\gamma_N \cdot f$ given by (32) cannot be easily related to observable market variables since it is a mixture of the firm sizes (which are observable) and of the not directly accessible underlying variables $\gamma_i$'s and $\Delta_{ii}$'s which describe the correlation structure of the disturbances $\varepsilon$ in the model (11). Nevertheless, as a consequence of the assumption that the $\gamma_i$'s and $\Delta_{ii}$'s have finite expectations, the behavior of the term $\frac{\sum_{i=1}^N S_i^2 \Delta_{ii}}{\left(\sum_{i=1}^N S_i \gamma_i\right)^2}$ is the same as that of $\frac{\sum_{i=1}^N S_i^2}{\left(\sum_{i=1}^N S_i\right)^2}$, which is nothing but the Herfindahl index $H_N$ of the market portfolio since 

$$H_N = \sum_{i=1}^N w_{m,i}^2 = \frac{\sum_{i=1}^N S_i^2}{\left(\sum_{i=1}^N S_i\right)^2}. \tag{34}$$

13 see footnote 6 

14 These figures are compatible with the number of stocks currently listed on the Amex, the Nasdaq and the NYSE. 

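In fact, propositions 1 and 2 are closely related. Loosely speaking, these two propositions can be summarized as follows: 

$$\mathrm{Var}\, r_e \overset{\text{law}}{\sim} \beta_e^2 \cdot \mathrm{Var}\, r_m + K_\mu \cdot H_N, \tag{35}$$

where 

$$K_\mu = \begin{cases} E[\Delta], & \mu > 2, \\ E\left[\Delta^{\mu/2}\right]^{2/\mu}, & 1 < \mu < 2, \\ E\left[\Delta^{\mu/2}\right]^{2/\mu} \cdot \dfrac{E[\gamma]^2}{E\left[|\gamma|^\mu\right]^{2/\mu}}, & \mu < 1. \end{cases} \tag{36}$$
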

Expression (35) has a simple intuitive meaning based upon the standard interpretation of the Herfindahl index as the inverse of the effective number of assets of a portfolio, if this portfolio were well-diversified (in fact, equally-weighted). Indeed, considering an equally-weighted portfolio made of $n$ assets, its Herfindahl index is $H = 1/n$. Conversely, given a portfolio whose Herfindahl index is $H$, its effective number of assets, defined as the number of assets of an equally-weighted portfolio with the same value $H$ of the Herfindahl index, is $n_{\mathrm{eff}} = 1/H$. Therefore, considering that the real market is not made of $N$ ($\simeq 7000$-$8000$) assets but actually of $N_{\mathrm{eff}} = 1/H_N$ ($\simeq 20$-$25$) effective assets, equation (33) expresses the variance of the equally-weighted portfolio as the sum of two terms: the first one gives the variance of the portfolio resulting from the exposure to the market risk, $\beta_e^2 \cdot \mathrm{Var}\, r_m$, while the second one represents the residual variance of the $N_{\mathrm{eff}} = 1/H_N$ assets. The constant $K_\mu$ appears as the average residual variance of the $N_{\mathrm{eff}}$ assets. Thus, when the market portfolio is well-diversified, $H_N$ goes to zero or, equivalently, the number of effective assets goes to infinity so that, by virtue of the law of large numbers, the residual variance $K_\mu / N_{\mathrm{eff}}$ goes to zero. In contrast, when the market portfolio is concentrated on a few assets, $H_N$ does not go to zero, the number of effective assets remains finite in the limit of an infinite economy and the residual variance does not go to zero. 
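The notion of effective number of assets is easy to compute from any vector of capitalizations. The helper names below are hypothetical and the example sizes are made up; the sketch simply normalizes sizes into weights, sums the squared weights, and inverts the result.

```python
def herfindahl_from_sizes(sizes):
    """Herfindahl index of the portfolio whose weights are sizes / sum(sizes)."""
    total = sum(sizes)
    return sum((s / total) ** 2 for s in sizes)


def effective_number(sizes):
    """Number of assets of the equally-weighted portfolio with the same index."""
    return 1.0 / herfindahl_from_sizes(sizes)


if __name__ == "__main__":
    equal = [1.0] * 20
    concentrated = [1000.0] + [1.0] * 999     # one giant firm, 999 small ones
    print("equal sizes:", effective_number(equal))
    print("concentrated market:", effective_number(concentrated))
```

A market of one thousand firms dominated by a single giant behaves, from the point of view of diversification, like a portfolio of only a handful of assets.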

For illustration purposes, we discuss in turn three cases. First, both propositions 1 and 2 show that the concentration index $H_N$ and the variance of $f$ are of the order of $1/N$, like the last two terms in the r.h.s. of expression (30), provided that the variance of the distribution of firm sizes is finite. As a consequence, for such distributions of firm sizes, the market portfolio is well diversified insofar as the concentration index is of the same order as the inverse of the number of assets in the portfolio. There is thus no additional non-diversifiable risk and, in the limit of a large market, we have 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + O_p\left(N^{-1}\right). \tag{37}$$

Let us consider the example of a distribution of firm sizes given by a Gamma law $\Gamma(r, \lambda)$. In such a case, it is well-known that the joint distribution of $\{w_{m,i}\}_{i=1}^N$ is a multivariate Beta law with parameter $r$ (Mosimann 1962), which yields 

$$E[H_N] = \frac{1 + r}{1 + N r}, \tag{38}$$

in accordance with the fact that $H_N = \frac{1 + r}{r} \cdot \frac{1}{N} + o_p(1/N)$. 
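A quick Monte Carlo check of the Gamma example is sketched below. It relies on a standard property of the symmetric Dirichlet (multivariate Beta) law, taken here as an assumption: with shape parameter r and N firms, the expected concentration index is (1 + r) / (1 + N r). The scale of the Gamma law is irrelevant because weights are normalized, so it is set to one; the function names and the number of draws are arbitrary choices.

```python
import random


def herfindahl_from_sizes(sizes):
    """Sum of squared normalized sizes."""
    total = sum(sizes)
    return sum((s / total) ** 2 for s in sizes)


def mean_herfindahl_gamma(n_assets, r, n_draws=4000, seed=0):
    """Monte Carlo estimate of E[H_N] when sizes are iid Gamma(r, 1)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_draws):
        sizes = [rng.gammavariate(r, 1.0) for _ in range(n_assets)]
        acc += herfindahl_from_sizes(sizes)
    return acc / n_draws


if __name__ == "__main__":
    n, r = 20, 2.0
    print("simulated E[H_N]:", mean_herfindahl_gamma(n, r))
    print("(1 + r) / (1 + N r):", (1.0 + r) / (1.0 + n * r))
```

The expected index decays like 1/N, so the effective number of assets grows in proportion to the actual number of firms: this is the well-diversified regime.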

Second, if the distribution of firm sizes admits only a finite mean value and, in addition, is regularly varying at infinity with a tail index $\mu \in (1, 2)$, propositions 1 and 2 state that both the concentration index and the variance of $f$ are of the order of $1/N^{2(1 - 1/\mu)}$. As a consequence, the contribution to the total risk due to the second term in the r.h.s. of (30) decays to zero much more slowly than the decay $\sim 1/N$ of the last two terms. Then 

$$\mathrm{Var}\, r_e = \beta_e^2 \cdot \mathrm{Var}\, r_m + \frac{C}{N^{2(1 - 1/\mu)}} + O_p\left(N^{-1}\right), \quad \text{for some } C > 0. \tag{39}$$

As an example, if the tail index of the distribution of firm sizes is $\mu = 3/2$, the ratio of the second term in the r.h.s. of (30) over the last two terms is of the order of $N^{1/3}$. Therefore, assuming that the prefactors of these contributions have the same magnitude, the second term is typically 10, 21 and 46 times larger than the last two terms, if one thousand, ten thousand and one hundred thousand companies are traded on the market, respectively. 
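The arithmetic behind these ratios is a one-liner. The sketch below assumes, as the text does, equal prefactors for the two contributions, so that the ratio of a term decaying as N to the power -2(1 - 1/mu) over a term decaying as 1/N is N to the power 2/mu - 1; the function name is hypothetical.

```python
def relative_size(n_assets, mu):
    """Ratio N^(-2(1 - 1/mu)) / N^(-1) = N^(2/mu - 1), for 1 < mu < 2."""
    return n_assets ** (2.0 / mu - 1.0)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, round(relative_size(n, 1.5), 1))
```

For a tail index of 3/2 the exponent is 1/3, which gives ratios of about 10, 21.5 and 46.4 for the three market sizes.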

Finally, if the distribution of firm sizes does not even admit a finite mean value but is still regularly varying at infinity with a tail index $\mu \in (0, 1)$, propositions 1 and 2 show that the Herfindahl index and the variance of $f$ converge to non-degenerate random variables which are the ratio of two positive and dependent stable random variables: 

$$H_N = H + o_p(1), \quad \text{with } H = \lim_{N \to \infty} \frac{S_1^2 + \cdots + S_N^2}{\left(S_1 + \cdots + S_N\right)^2}, \tag{40}$$

$$\mathrm{Var}\, f = \sigma_f^2 + o_p(1), \quad \text{with } \sigma_f^2 = \lim_{N \to \infty} \frac{\Delta_{11} \cdot S_1^2 + \cdots + \Delta_{NN} \cdot S_N^2}{\left(\gamma_1 \cdot S_1 + \cdots + \gamma_N \cdot S_N\right)^2}, \tag{41}$$

so that 

$$\mathrm{Var}\, r_e = \underbrace{\beta_e^2 \cdot \mathrm{Var}\, r_m}_{\text{market risk}} + \underbrace{E[\gamma]^2 \cdot \sigma_f^2}_{\text{specific non-diversified risk}} + o_p(1). \tag{42}$$

The first term in the r.h.s. of (42) is the non-diversifiable market risk which is remunerated by the market according to the CAPM formula. The second term clearly exemplifies the fact that, due to (i) the dependence between the $\varepsilon$'s resulting from the internal consistency condition and (ii) the Pareto form of the distribution of the size of companies, full diversification cannot occur even in the limit of a market with an infinite number of assets. Consider the example where the distribution of firm sizes is the Lévy law defined by equation (10). Using its properties of stability under convolution, the distribution of the market weights $w_{m,i}$ can be easily obtained. For instance, the density of the marginal law of $w_{m,i}$ is given by 

$$g_N(w) = \frac{N - 1}{\pi} \cdot \frac{w^{-1/2} (1 - w)^{-1/2}}{1 + \left[(N - 1)^2 - 1\right] w}, \tag{43}$$

so that $E[H_N] = \frac{1}{2} \cdot \frac{N + 1}{N}$, in agreement with the fifth statement of proposition 1 and (40). 
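The Lévy example lends itself to a direct simulation. The sketch below assumes that the Lévy law in question is the standard one, which can be sampled as the inverse square of a standard normal variable, and compares the simulated mean concentration index with the value (N + 1) / (2N) implied by that assumption. Function names, the number of firms and the number of draws are arbitrary illustrative choices.

```python
import random


def herfindahl_from_sizes(sizes):
    """Sum of squared normalized sizes."""
    total = sum(sizes)
    return sum((s / total) ** 2 for s in sizes)


def mean_herfindahl_levy(n_assets, n_draws=20000, seed=0):
    """Monte Carlo estimate of E[H_N] for sizes S = 1/Z^2, Z standard normal."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_draws):
        sizes = [1.0 / rng.gauss(0.0, 1.0) ** 2 for _ in range(n_assets)]
        acc += herfindahl_from_sizes(sizes)
    return acc / n_draws


if __name__ == "__main__":
    n = 10
    print("simulated E[H_N]:", mean_herfindahl_levy(n))
    print("(N + 1) / (2 N):", (n + 1) / (2.0 * n))
```

In contrast with the Gamma case, the mean concentration index tends to 1/2 instead of zero as the number of firms grows, so the effective number of assets stays of order two however large the market.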



2.5 Generalization to arbitrary well-diversified portfolios 

The detailed results obtained until now in section 2 refer to one particular portfolio, the equally-weighted portfolio. This portfolio is interesting because it is often taken as a reference and as a starting point to more elaborate allocations by analysts and practitioners. However, from the previous sections, it seems natural to conjecture that the results summarized in proposition 2 also hold (with suitable adaptation) for the entire class of well-diversified portfolios, as suggested by equation (29). By well-diversified portfolio is meant a portfolio of $N$ assets whose concentration index goes to zero in the limit of large $N$. In the particular case where we consider a portfolio $p$, with weight on asset $i$ given by $w_{p,i} = a_i / N$, where the $a_i$'s have to sum up to $N$ in order to ensure that the sum of the fractions of wealth invested in each asset is equal to one, and such that $\frac{1}{N} \sum_{i=1}^N a_i^2$ is uniformly bounded from above by some finite constant, the Herfindahl index of $p$ behaves as 

$$H_{p,N} \sim \frac{C}{N}, \quad \text{as } N \to \infty, \tag{44}$$

where $C$ is a positive and finite constant. Then, the variance of portfolio $p$ reads 

$$\mathrm{Var}\, r_p = \beta_p^2 \cdot \mathrm{Var}\, r_m + E[\gamma]^2 \cdot \frac{\sum_{i=1}^N S_i^2 \Delta_{ii}}{\left(\sum_{i=1}^N S_i \gamma_i\right)^2} + o_p(1), \tag{45}$$

by virtue of the law of large numbers. 

This expression shows that the term $\frac{\sum_{i=1}^N S_i^2 \Delta_{ii}}{\left(\sum_{i=1}^N S_i \gamma_i\right)^2}$ (or equivalently the concentration index $H_N$ of the market portfolio) still controls the decay (or the absence of decay) of the contribution to the variance in addition to the variance associated with the correlation of the portfolio $p$ with the market portfolio. Therefore, we conclude that proposition 2 holds for the entire class of portfolios whose Herfindahl index decays to zero as $C/N$, for large $N$. In fact, the result holds for this class of long portfolios, i.e. such that the weights $a_i / N$ sum up to one. In the case of an arbitrage portfolio, namely a portfolio whose weights $a_i / N$ sum up to zero, no additional term appears in the variance (45). 

Finally, when the concentration index of the portfolio under consideration goes to zero, but at a rate slower than $1/N$, obtaining a detailed result for the variance of the portfolio's return involves more complex formulas. For the present work, equation (29) is sufficient to state that well-diversified portfolios, of which the equally-weighted portfolio is just an example, generally have a non-diversified risk which does not vanish in the limit of large economies, if the distribution of firm capitalizations is sufficiently heavy-tailed. Therefore, holding a portfolio with asymptotically vanishing Herfindahl index does not necessarily diversify away the non-systematic risk. 

3 Discussion 

3.1 Analysis of synthetic markets generated numerically 

In order to assess the impact of the internal consistency factor in real stock markets of finite size, we present in table 1 the results of numerical simulations of synthetic markets with respectively $N = 1000$ and $N = 10000$ traded assets. We construct the synthetic markets according to model (19) so that the only explicit explaining factor is the market factor and the size distribution of the capitalization of firms is the Pareto distribution 

$$\Pr[S > s] = \frac{1}{s^\mu} \cdot 1_{s > 1}. \tag{46}$$

We investigate various synthetic markets characterized by different tail indices $\mu$, from $\mu = 1/2$ (deep in the heavy-tailed regime) and $\mu = 1$ (borderline case often referred to as the Zipf law when expressed with sizes plotted as a function of ranks) to $\mu = 2$ (for which the central limit theorem holds and standard results are expected). It is important to stress that the results presented in table 1 are insensitive to the shape of the bulk of the distribution of firm sizes, and only the tail $\Pr[S > s] \sim s^{-\mu}$, for large $s$, matters. 

The three values of the tail index $\mu$ equal to 2, 1 and 1/2 correspond to the three major behaviors of the residual variance of a "well-diversified" portfolio, namely the part of the total variance related to the disturbance term $\varepsilon$ only, given by proposition 2: 

• for $\mu = 2$, the residual variance goes to zero as $1/N$, so that the market return should be the only relevant explaining factor if the number of traded assets is large enough; 

• for $\mu = 1$, the residual variance goes very slowly to zero, so that one can expect a significant contribution to the total risk and a strong impact of the internal consistency factor $f$ for large (but finite) market sizes; 

• for $\mu = 1/2$, the residual variance does not go to zero and one can expect that the contribution of the residual variance to the total risk remains a finite contribution as the size of the market increases without bounds. 

For each value $\mu = 2$, $\mu = 1$ and $\mu = 1/2$, we generate 100 synthetic markets of each size $N = 1000$ and $N = 10000$ (hence a total of $3 \times 2 \times 100$ synthetic markets). For each market, we construct 20 equally-weighted portfolios (randomly drawn from each market) and we regress their returns on the returns of the market portfolio ($r_m$), on the returns of the market portfolio and of the internal consistency factor ($r_m, f$), on the returns of the market portfolio and of the (overall) equally-weighted portfolio ($r_m, r_e$), on the returns of the market portfolio and of an arbitrary under-diversified portfolio ($r_m, r_u$) and on the returns of the market portfolio and of an arbitrary well-diversified arbitrage portfolio ($r_m, r_a$). Using the 100 market simulations for each case $(\mu, N)$, Table 1 summarizes the mean, minimum and maximum values of the coefficient of determination $R^2$ of these five regressions of the 20 equally-weighted portfolios. 
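A scaled-down version of such an experiment can be sketched as follows. This is not the simulation code used for the table: it assumes unit betas and unit loadings for every asset, zero mean returns, Gaussian market and residual shocks with made-up volatilities, a much smaller market, and a single sub-portfolio. All function names are hypothetical. The regressions are computed by successive orthogonalization (Gram-Schmidt), which yields the same coefficient of determination as ordinary least squares with an intercept.

```python
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _demean(v):
    m = sum(v) / len(v)
    return [x - m for x in v]


def _residual(y, x):
    """Residual of the projection of y on x (both already demeaned)."""
    b = _dot(x, y) / _dot(x, x)
    return [yi - b * xi for yi, xi in zip(y, x)]


def r_squared(y, regressors):
    """R^2 of the OLS regression of y on the regressors plus an intercept."""
    y0 = _demean(y)
    e, basis = y0[:], []
    for x in regressors:
        x0 = _demean(x)
        for q in basis:                 # orthogonalize against earlier regressors
            x0 = _residual(x0, q)
        e = _residual(e, x0)
        basis.append(x0)
    return 1.0 - _dot(e, e) / _dot(y0, y0)


def compare_r2(mu, n_assets=500, n_periods=250, k=25, seed=0,
               sigma_m=0.05, sigma_eta=0.05):
    """R^2 of one k-asset equally-weighted portfolio on three regressor sets."""
    rng = random.Random(seed)
    sizes = [rng.paretovariate(mu) for _ in range(n_assets)]  # Pr[S > s] = s^-mu
    total = sum(sizes)
    w = [s / total for s in sizes]
    members = rng.sample(range(n_assets), k)
    r_m, f, r_p, r_e = [], [], [], []
    for _ in range(n_periods):
        eta = [rng.gauss(0.0, sigma_eta) for _ in range(n_assets)]
        m_t = rng.gauss(0.0, sigma_m)
        f_t = -_dot(w, eta)             # endogenous factor when all gamma_i = 1
        # Each asset return is m_t + f_t + eta_i, so that w'r = m_t exactly.
        r_m.append(m_t)
        f.append(f_t)
        r_p.append(m_t + f_t + sum(eta[i] for i in members) / k)
        r_e.append(m_t + f_t + sum(eta) / n_assets)
    return {"r_m": r_squared(r_p, [r_m]),
            "r_m,f": r_squared(r_p, [r_m, f]),
            "r_m,r_e": r_squared(r_p, [r_m, r_e])}


if __name__ == "__main__":
    for mu in (2.0, 1.0, 0.5):
        print(mu, compare_r2(mu))
```

Because the regressions are nested, adding the internal consistency factor or the equally-weighted portfolio can only raise the in-sample coefficient of determination; the interesting quantity is by how much, and how this gain depends on the tail index.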

[Insert Table 1 about here] 

For $\mu = 2$, as was expected, the market return is the only relevant factor: it accounts on average for about 95% and 99% (for $N = 1000$ and $N = 10000$ assets, respectively) of the total variance of the 20 equally-weighted portfolios under consideration. The fact that the explained variance increases from 95% to 99% when going from $N = 1000$ to $N = 10000$ assets results from the standard diversification effect: for $N = 1000$, each of the 20 equally-weighted portfolios is made of only 1000/20 = 50 assets, compared with 10000/20 = 500 assets for $N = 10000$. As a confirmation, the minimum and maximum values of the $R^2$ remain very close to their respective mean values. 

For $\mu = 1$, the market factor explains a much smaller part of the total variance compared with the previous case (80% and 88%, respectively for $N = 1000$ and $N = 10000$ assets). As expected, this effect is stronger for the markets with the smallest number $N = 1000$ of traded assets. In addition, the minimum $R^2$ (1% and 20%, resp.) departs strongly from its mean value. Besides, the regression on the market factor and the internal consistency factor $f$ (which is readily accessible in the case of a numerical simulation) provides a level of explanation (95% and 99%, respectively) comparable to that of the case $\mu = 2$ for which full diversification of the residual risk occurs. Moreover, the equally-weighted portfolio provides the same level of explanation as $f$ itself. This is particularly interesting insofar as $f$ is not observable in a real market while the return on the equally-weighted portfolio can always be calculated, or at least proxied. We find more generally that any well-diversified portfolio provides overall the same explaining power. This result is simply related to the fact that the internal consistency factor $f$ is responsible for the lack of diversification of "well-diversified" portfolios (when $\mu \le 1$) so that the return on any "well-diversified" portfolio $p$ reads $r_p \simeq \alpha_p + \beta_p \cdot r_m + E[\gamma] \cdot f$. This suggests that the equally-weighted portfolio or any well-diversified portfolio, insofar as it is strongly sensitive to the internal consistency factor $f$, may act as a good proxy for this factor. 

In contrast, the regression on any under-diversified portfolio, while improving on the regression performed just using the market portfolio, remains of lower quality: the gain in $R^2$ is only 5-6% on average with respect to the regression on the sole market portfolio, while the gain in $R^2$ lies in the range 10-15% when using the equally-weighted portfolio. Finally, table 1 shows that the introduction of an arbitrage portfolio does not improve the regression. This is due to the fact that arbitrage portfolios are not asymptotically sensitive to the internal consistency factor $f$ in the large $N$ limit, as recalled in section 2.5. 

The same conclusions hold qualitatively for synthetic markets generated with $\mu = 1/2$,
with the important quantitative change that the explanatory power of the market factor
does not increase with the market size $N$. This expresses the predicted property that the
internal consistency factor $f$ should make an asymptotically finite contribution to the residual
variance as the size of the market increases without bound.
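The role of the tail index can be illustrated with a short simulation. The sketch below is not the simulation code behind Table 1: it only draws firm sizes from a Pareto law and measures the concentration of the resulting value-weighted portfolio with the Herfindahl index $H_N = \sum_i w_i^2$, a measure introduced here for illustration. The function names, the seed and the grid of $(\mu, N)$ values are assumptions.

```python
import random


def herfindahl(weights):
    """Sum of squared weights: 1/N for equal weights, 1 for a single asset."""
    return sum(w * w for w in weights)


def market_weights(n_firms, tail_index, rng):
    """Value weights for firm sizes drawn from a Pareto law with the given tail index."""
    sizes = [rng.paretovariate(tail_index) for _ in range(n_firms)]
    total = sum(sizes)
    return [s / total for s in sizes]


def concentration_table(tail_indices=(0.5, 1.0, 2.0), market_sizes=(1000, 10000), seed=0):
    """Herfindahl index of the synthetic market portfolio for each (mu, N) pair."""
    rng = random.Random(seed)
    return {
        (mu, n): herfindahl(market_weights(n, mu, rng))
        for mu in tail_indices
        for n in market_sizes
    }


if __name__ == "__main__":
    for (mu, n), h in concentration_table().items():
        print(f"mu = {mu:3.1f}  N = {n:6d}  H_N = {h:.4f}  (equal weights: {1 / n:.4f})")
```

For a thin enough tail the index should shrink as $N$ grows, whereas for $\mu = 1/2$ it is expected to stay of order one, which is the concentration effect that keeps the contribution of $f$ finite in large markets.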

Finally, our numerical tests confirm that the distributional properties of the $\gamma$'s (the
factor loadings of the residuals on the internal consistency factor $f$) have no significant
impact on the results of the simulation, provided that $\mathrm{E}[|\gamma|] < \infty$.

3.2 Consequences for the Arbitrage Pricing Theory and the
standard pricing anomalies

In his article establishing the arbitrage pricing theory, Ross (1976, p. 347) explicitly assumes
that the disturbance terms in the factor model (11) are "mutually stochastically
uncorrelated," which is inconsistent with the constraint (15) if we assume that the factors (or at
least some of them) can be replicated by portfolios of assets. Indeed, the derivation of the
APT results from the construction of a well-diversified arbitrage portfolio (step 1 in Ross
(1976, p. 342)) chosen so as to have no systematic risk (step 2). The fact that this arbitrage
portfolio is well-diversified is important because it underlies the argument for the
diversification of the specific risk of the arbitrage portfolio in the limit of a large number of
assets (law of large numbers), which conditions the results of steps 3 and 4 in Ross (1976).
Unfortunately, as shown in section II-E, if one of the factors can be replicated by a portfolio
whose weights are distributed according to a sufficiently fat-tailed distribution, the specific
risk of this portfolio cannot be diversified away even if it is a well-diversified portfolio, as
defined in section II-E. In that case, the conclusion resulting from steps 3 and 4 in Ross
(1976) breaks down.

Alternatively, we can say that the residual risks are too strongly correlated. This
problem has been tackled by many authors. In particular, Chamberlain (1983) and
Chamberlain and Rothschild (1983) have developed the appropriate formalism to deal with it,
while Stambaugh (1982) and Ingersoll (1984) have provided sharp pricing bounds in the
presence of correlation between the error terms. Basically, when all the eigenvalues of their
covariance matrix remain bounded as more and more assets are added to the market until
its size goes to infinity, the APT holds. In contrast, when several eigenvalues grow without
bound, the factors associated with these eigenvalues must be split off from the residuals
and considered as new explanatory factors that should be priced. This argument is the
basis for the choice of the specification (17) of the dependence structure of the disturbances
of our market model. Therefore, if we explicitly include our additional internal consistency
risk factor $f$ in the analysis, the original derivation of Ross' results still holds, as shown by
Chamberlain (1983). Indeed, a key technical assumption for the APT to hold is that the
$\varepsilon_i$'s (in equation (11)) are "sufficiently independent to ensure that the law of large numbers
holds" (Ross 1976, p. 342) and, as explained in the previous sections, this condition breaks
down. Nonetheless, this condition holds for the residuals $\eta_i$ defined by equations (17)-(19).
Then, for the one-factor model (19), the following result holds:

Proposition 3. Consider a market where $N$ assets are traded and for which the internal
consistency condition (15) holds, so that the returns of the set of assets obey the following
dynamics: $\vec{r} = \mathrm{E}[\vec{r}] + \vec{\beta} \cdot [r_m - \mathrm{E}[r_m]] + \vec{\gamma} \cdot f + \vec{\eta}$, where $f$ is the (zero-mean) additional
factor resulting from the internal consistency condition and $r_m$ is uncorrelated with $f$ and
the centered disturbance vector $\vec{\eta}$. Then, under the usual assumptions required for the APT
to hold, the expected return on asset $i$ satisfies

$\mathrm{E}[r_i - r_0] = \beta_i \cdot \mathrm{E}[r_m - r_0] + (\gamma_i - \gamma_m \cdot \beta_i) \cdot \mathrm{E}[r_{icc} - r_0]$ , (47)

where $r_0$ denotes the risk-free interest rate and $\mathrm{E}[r_{icc}] > r_0$ is the expected return on any
portfolio $w_{icc}$ such that $w_{icc}' \cdot \vec{\beta} = 0$, with unit exposure to the factor $f$ (i.e. such that
$w_{icc}' \cdot \vec{\gamma} = 1$) and which is well-diversified in the sense that the variance $\mathrm{Var}(w_{icc}' \cdot \vec{\eta})$ goes
to zero as the number $N$ of assets goes to infinity. $\gamma_m = w_m' \cdot \vec{\gamma}$ is the gamma of the market
portfolio. The index $icc$ refers to the "internal consistency condition."

The proof of this result proceeds as follows. Starting from the model (19) and following
step by step the demonstration of theorems I and II in Ross (1976), we get the asymptotic
result

$\mathrm{E}[\vec{r}] = \rho\,\vec{1} + \lambda_1 \vec{\beta} + \lambda_2 \vec{\gamma}$ , (48)

where $\rho$, $\lambda_1$ and $\lambda_2$ are three non-negative constants. Their values are determined by
expressing the expected return on the market portfolio $w_m$, on the portfolio $w_{icc}$ and on any
well-diversified portfolio without any systematic risk. This leads to identifying $\rho$ with $r_0$,
$\lambda_2$ with $\mathrm{E}[r_{icc} - r_0]$ and $\lambda_1$ with $\mathrm{E}[r_m] - \gamma_m \cdot \mathrm{E}[r_{icc} - r_0] - r_0$. The quantity $\gamma_m = w_m' \cdot \vec{\gamma}$ never
vanishes, due to the internal consistency constraint of the model.
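The identification of the three constants can be written out explicitly. The lines below are a reconstruction rather than a quotation of the original proof; they assume that the market portfolio has unit beta, $\beta_m = w_m' \cdot \vec{\beta} = 1$, which is implied by the market model but not restated in this section.

```latex
\begin{align*}
\text{no systematic risk } (\beta_p = \gamma_p = 0):
  &\quad \mathrm{E}[r_p] = r_0 = \rho, \\
w_{icc} \ (\beta = 0,\ \gamma = 1):
  &\quad \mathrm{E}[r_{icc}] = r_0 + \lambda_2
  \ \Rightarrow\ \lambda_2 = \mathrm{E}[r_{icc} - r_0], \\
w_m \ (\beta_m = 1,\ \gamma_m = w_m' \cdot \vec{\gamma}):
  &\quad \mathrm{E}[r_m] = r_0 + \lambda_1 + \lambda_2 \gamma_m
  \ \Rightarrow\ \lambda_1 = \mathrm{E}[r_m - r_0] - \gamma_m \cdot \mathrm{E}[r_{icc} - r_0], \\
\text{asset } i \text{ in (48)}:
  &\quad \mathrm{E}[r_i - r_0] = \lambda_1 \beta_i + \lambda_2 \gamma_i
   = \beta_i \cdot \mathrm{E}[r_m - r_0] + (\gamma_i - \gamma_m \cdot \beta_i) \cdot \mathrm{E}[r_{icc} - r_0].
\end{align*}
```

The last line is expression (47): the correction $-\gamma_m \cdot \beta_i$ appears because the market portfolio itself carries the exposure $\gamma_m$ to the factor $f$.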

Two comments are in order. Firstly, expression (47) looks like a standard APT
decomposition of the expected return of a given asset $i$ into risk premia weighted by their factor
loadings, except for one important feature: the risk premium due to the internal consistency
factor has its amplitude controlled by the factor loading $\gamma_i$ (as usual) corrected by the
unusual term $-\gamma_m \beta_i$. In a standard factor decomposition, it is always convenient to impose
$\gamma_m = w_m' \cdot \vec{\gamma} = 0$ so that the contribution to the total risk premium due to any factor is
proportional to its corresponding factor loading $\gamma_i$. In the factor decomposition including
the internal consistency factor, this is intrinsically impossible, as we have stressed above. In
this sense, expression (47) is not the result of a standard factor decomposition. It is however
the correct decomposition for a one-factor model in the presence of the internal consistency
condition, which may lead to the creation of the new internal consistency factor. The latter
should in fact be referred to as an endogenous factor. The decomposition leading to (47)
is the correct one, in particular because it highlights the crucial consequence of the internal
consistency condition for the contribution of the endogenous factor to the total risk premium
of a given asset. As we shall see, the fact that the factor loading $\beta_i$ on the market portfolio
contributes to the amplitude of the risk premium due to the endogenous factor provides an
interesting interpretation of the book-to-market effect.

Secondly, in the case where the market portfolio is well-diversified, the contribution of
the additional risk factor $f$ vanishes asymptotically so that the risk premium associated with
this risk factor goes to zero in the limit of an infinitely large market.

3.3 Empirical consequences 

The pricing formula given by Proposition 3 offers an interesting new insight into the
valuation of assets. However, the direct assessment of the risk premium associated with
the internal consistency risk factor ICC is not possible because we do not have a priori
access to it, so that the practical implementation of this theoretical framework seems
problematic. Nonetheless, if we recall that the risk premium associated with the additional
term $(\gamma_i - \gamma_m \cdot \beta_i) \cdot \mathrm{E}[r_{icc} - r_0]$ is due to the lack of diversification of the so-called "specific
risk," and that well-diversified portfolios such as the equally-weighted portfolio are
particularly sensitive to this risk, it seems natural to consider the return on this portfolio in
order to probe the market price of the non-diversified risk. Besides, the numerical
simulations presented in section 3.1 testify to the relevance of this choice. However, insofar as the
equally-weighted portfolio is (by construction) strongly correlated with the market portfolio,
it can be desirable to consider instead the arbitrage portfolio made of a long position in the
equally-weighted portfolio and of a short position in the market portfolio. This arbitrage
portfolio constitutes our proxy for the ICC risk factor and we denote by $r_{icc}(t)$ the time
series of its returns. Therefore, this reasoning applied to Proposition 3 leads us to estimate
the following regression model

$r_{i,t} - r_0 = \alpha_i + \beta_i \cdot [r_m(t) - r_0] + \beta_i^{ICC} \cdot r_{icc}(t) + \varepsilon_i(t)$ . (49)
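A minimal sketch of how regression (49) could be estimated is given below. It builds the ICC proxy as the return on the equally-weighted portfolio minus the return on the market portfolio, then fits the model by ordinary least squares. The data are synthetic and the helper names (`solve_linear_system`, `ols`, `icc_proxy`) are assumptions made for this illustration; this is not the code used to produce the tables.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(y, regressors):
    """Least-squares coefficients [intercept, slope_1, ...] and the R^2 of the fit."""
    rows = [[1.0] + list(x) for x in zip(*regressors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot


def icc_proxy(r_equal_weighted, r_market):
    """Long the equally-weighted portfolio, short the market portfolio."""
    return [re - rm for re, rm in zip(r_equal_weighted, r_market)]


# Synthetic monthly data (illustrative numbers only, not the data set of the paper).
rng = random.Random(1)
T = 240
mkt_excess = [rng.gauss(0.006, 0.05) for _ in range(T)]              # r_m(t) - r_0
ew_excess = [1.1 * m + rng.gauss(0.002, 0.03) for m in mkt_excess]   # equally-weighted portfolio
r_icc = icc_proxy(ew_excess, mkt_excess)  # the risk-free rate cancels in the difference
asset_excess = [0.001 + 0.9 * m + 0.5 * f + rng.gauss(0.0, 0.01)
                for m, f in zip(mkt_excess, r_icc)]

(alpha_i, beta_i, beta_icc), r2 = ols(asset_excess, [mkt_excess, r_icc])
print(f"alpha = {alpha_i:.4f}, beta = {beta_i:.3f}, beta_ICC = {beta_icc:.3f}, R^2 = {r2:.3f}")
```

Solving the normal equations directly is adequate for a handful of factors; with real data a dedicated linear-algebra library would be the more robust choice.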

In order to assess the explanatory power of the new factor, we also include in the regression
model the two factors SMB and HML of Fama and French (see Fama and French (1993)
for the description of the construction of these two portfolios). We use the monthly excess
returns of twenty-five equally-weighted portfolios sorted by the quintiles of the distribution of
sizes and book-to-market values and the returns of ten value-weighted and equally-weighted
industry portfolios (see footnote 15). Tables 2 to 7 present our results for the period from
Jan. 1927 to Dec. 2005.

[Insert Tables 2 and 3 about here]

Table 2 presents the parameter estimates of the multi-linear time series regression of the
excess monthly returns of 25 equally-weighted portfolios (sorted by quintiles of the distribution
of sizes - Small, 2, 3, 4 and Big - and by quintiles of the distribution of Book equity to Market
equity ratio - Low, 2, 3, 4 and High) regressed on the excess return on the market portfolio,
on the two Fama-French factors SMB and HML and on the proxy ICC for the additional risk
factor due to the internal consistency constraint given by the difference between the return
on the equally-weighted portfolio and the return on the market portfolio:

$r_{i,t} - r_0 = \alpha_i + \beta_i \cdot [r_m(t) - r_0] + \beta_i^{ICC} \cdot r_{icc}(t) + \beta_i^{SMB} \cdot r_{smb}(t) + \beta_i^{HML} \cdot r_{hml}(t) + \varepsilon_i(t)$ . (50)

The figures marked with one star (resp. two stars) indicate the cases which reject the null
hypothesis that the factor under consideration is not significant in the presence of the others
at the 5% (resp. the 1%) level. Clearly, the three factors SMB, HML and ICC are almost
always significant at the 1% level, suggesting that it is a priori useful to consider these three
factors together. The regressions on the four factors provide a very good explanation of the
portfolios' excess returns, as witnessed by the $R^2$'s, which are larger than, or of the order of,
90% for most portfolios, except for three extreme cases: Small-Low, Small-2 and Big-High.

15 We have used the monthly data available on Professor French's website:
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/25_Portfolios_5x5.zip for the 25
portfolios sorted by size and book-to-market,
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/10_Industry_Portfolios.zip for the ten
industry portfolios and
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors.zip for the
market factor, the risk-free interest rate and the two factors SMB and HML.

However, these conclusions must be tempered in view of the results summarized in Table 3,
which gives the $R^2$ of the various multi-linear time series regressions of the monthly
excess returns of these 25 equally-weighted portfolios on the market portfolio (Rm), on the
market portfolio and the factor ICC (ICC), on the market portfolio and the size factor (SMB),
on the market portfolio and the book-to-market factor (HML), on the market portfolio and
the two Fama-French factors (HML + SMB), on the market portfolio, the factor ICC and
the size factor (ICC + SMB), on the market portfolio, the factor ICC and the book-to-market
factor (ICC + HML) and, finally, on all four factors (market, ICC, SMB and HML). The
numbers in boldface represent the maximum value of the $R^2$ within the group of regressions
with two factors (columns ICC, SMB and HML) and with three factors (columns HML +
SMB, ICC + SMB and ICC + HML), while the numbers in parentheses provide the 95%
confidence interval of the $R^2$ obtained by bootstrap (Efron and Tibshirani 1993).
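The bootstrap intervals can be sketched as follows. The text does not specify the resampling scheme, so this example assumes the simplest one: months are resampled with replacement as independent pairs and a percentile interval is read off the resampled $R^2$ values. For brevity it uses a single regressor, in which case $R^2$ is the squared correlation; the function names and the synthetic data are assumptions.

```python
import random


def r_squared(y, x):
    """R^2 of the one-regressor linear fit y = a + b x (squared Pearson correlation)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def bootstrap_r2_interval(y, x, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for R^2, resampling observations with replacement."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(r_squared([y[i] for i in idx], [x[i] for i in idx]))
    stats.sort()
    low = stats[int((1.0 - level) / 2.0 * (n_resamples - 1))]
    high = stats[int((1.0 + level) / 2.0 * (n_resamples - 1))]
    return low, high


# Synthetic example: one asset regressed on one factor (illustrative numbers only).
data_rng = random.Random(42)
factor = [data_rng.gauss(0.0, 0.05) for _ in range(300)]
asset = [0.9 * f + data_rng.gauss(0.0, 0.02) for f in factor]

point = r_squared(asset, factor)
low, high = bootstrap_r2_interval(asset, factor)
print(f"R^2 = {100 * point:.1f}%  95% interval = ({100 * low:.1f}%, {100 * high:.1f}%)")
```

Two nested models can then be compared, as in the discussion of Table 3, by checking whether the $R^2$ of the richer model falls outside the interval obtained for the simpler one. Resampling months independently ignores any serial dependence in returns, which is a simplification of this sketch.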

Several comments are in order. First, for the two-factor models - namely the regression
models which include the market factor and one of the factors ICC, SMB or HML - the
internal consistency factor ICC provides the best explanation in 11 cases out of 25. Second,
for the groups of portfolios within the first three quintiles of the distribution of sizes, i.e.,
Small, 2 and 3, the factor ICC provides the largest improvement in 10 cases out of 15.
Besides, the improvement provided by the factor ICC is particularly important for the group
of the five portfolios built on the first quintile of the distribution of size (group "Small")
with respect to both the size and the book-to-market factors. Third, based upon the 95%
confidence intervals (figures in parentheses) obtained by bootstrap, this improvement is
statistically significant with respect to the regression on the market factor alone (see
footnote 16) and also with respect to the regression on the market portfolio and either the
size or the book-to-market factor in the group "Small". In contrast, for portfolios belonging
to the last two quintiles of the distribution of size, i.e., portfolios of the groups 4 and Big,
the factor HML provides the largest improvement 9 times out of 10 and is statistically
significant, with respect to the regression on the market factor alone, for 8 of these portfolios.

For the three-factor models, the pair (SMB, HML) provides the best improvement in 13
cases out of 25, ahead of the pair (ICC, HML), which is the best 8 times out of 25, while the
pair (SMB, ICC) wins the "horse race" only 4 times out of 25. However, these improvements
are statistically significant with respect to the best two-factor model (which is most often the
market + factor ICC) in only 5 cases out of 25, namely for the portfolios 2-4, 2-High, 3-4,
3-High and 4-Low. Therefore, the usefulness of a three-factor model is clearly questionable.

To sum up our tests performed on the 25 equally-weighted portfolios ordered by quintiles
in size and book-to-market, we have found that, on average, the factor ICC alone provides the
best significant improvement with respect to the market factor, and also provides a significant
improvement with respect to the market factor and either the size or the book-to-market
factor. Overall, the addition of one or two of the Fama and French factors turns out to provide
only a marginal improvement. The confidence intervals on the $R^2$ obtained by bootstrap
suggest that a two-factor model (market portfolio + factor ICC) has almost the same
explanatory power as the three-factor Fama-French model, while being more parsimonious
and based on solid economic foundations. Besides, the significance of the intercepts $\alpha$'s remains
comparable (see the last two lines of Table 3). In all cases, the GRS test (Gibbons et al. 1989)
indicates that the intercept is significantly different from zero. In this respect, the factor
ICC does not really improve on the two factors of Fama and French but, clearly, the GRS
statistic reaches its minimum when the size factor is replaced by the ICC factor. Therefore,
based on the results on the "Small" group of portfolios, on the GRS test and on our theoretical
approach, we can finally conclude that the factor ICC is superior to the size
factor SMB. On the other hand, the explanatory power of the book-to-market factor HML seems
indisputable even if it is weakened in the presence of ICC.
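For reference, the statistic of Gibbons et al. (1989) is usually written as below for $N$ test portfolios, $L$ factors and $T$ observations. The notation is introduced here for illustration and does not appear in the original text; the normalization of the covariance estimates varies slightly between presentations, and the exact $F$ distribution relies on normally distributed, serially independent residuals.

```latex
\mathrm{GRS} \;=\; \frac{T - N - L}{N}\;
\frac{\hat{\alpha}'\,\hat{\Sigma}^{-1}\,\hat{\alpha}}
     {1 + \hat{\mu}'\,\hat{\Omega}^{-1}\,\hat{\mu}}
\;\sim\; F_{N,\;T-N-L}
\qquad \text{under } H_0:\ \alpha_i = 0 \ \text{ for all } i ,
```

where $\hat{\alpha}$ is the vector of estimated intercepts, $\hat{\Sigma}$ the covariance matrix of the regression residuals, and $\hat{\mu}$ and $\hat{\Omega}$ the sample mean and covariance matrix of the factors. With $N = 25$ and $T = 948$, values of the statistic around 4 lie far in the upper tail of the $F$ distribution, which is consistent with the p-values of 0.00 reported in Table 3.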



[Insert Tables 4 and 5 about here]

16 Note that, a priori, the quoted $R^2$ of the linear models are not directly comparable since they involve different
numbers of parameters. In principle, it is thus necessary to use the adjusted $R^2$ instead of the raw $R^2$. However,
the large number of data points (948) makes the difference between these two quantities irrelevant at the level of
precision of the first decimal place.
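The size of this correction is easy to check numerically. The sketch below uses the usual definition of the adjusted $R^2$ for a regression with an intercept; the function name and the raw value of 90% are illustrative choices.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R^2 for a linear model with an intercept and n_regressors factors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# 948 monthly observations, raw R^2 of 90%, one to four factors.
for k in (1, 2, 3, 4):
    adjusted = adjusted_r2(0.90, 948, k)
    print(f"{k} factor(s): raw 90.00%  adjusted {100 * adjusted:.2f}%  "
          f"gap {100 * (0.90 - adjusted):.3f} points")
```

With four factors the gap is $0.1 \times 4 / 943$, that is about 0.04 percentage points.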
The following tables provide the same statistics for value-weighted and equally-weighted
industry portfolios, which confirm the previous conclusions. Table 4 presents the parameter
estimates of the multi-linear time series regression of the excess monthly returns of 10
value-weighted industry portfolios regressed, as in Table 2, on the excess return on the market
portfolio, on the two Fama-French factors and on the factor ICC. In the presence of the risk
factor ICC, the factor SMB turns out not to be significant for most portfolios (7 cases out of
10). Conversely, in the presence of the factor SMB, the factor ICC has no explanatory power
in only 4 cases out of 10. This clearly confirms that, overall, ICC is a superior substitute
for SMB. For the HML factor, the table shows that this factor is always significant, even in
the presence of the factor ICC. Again, these observations must be tempered by the results
of Table 5, which provides the $R^2$ of the various multi-linear time series regressions of the
monthly excess returns of these 10 value-weighted industry portfolios on the same set of
factors as in Table 3. It is striking to observe that, on the basis of the 95% confidence
intervals obtained by bootstrap, none of the factors ICC, SMB and HML, or any combination
thereof, is able to provide a significant improvement with respect to the regression on the
market factor alone (with the exception of the portfolio "Others"). Concerning the factor ICC,
this observation is not a big surprise since it is expected to provide a strong explanatory
power for well-diversified portfolios. But, by construction, value-weighted portfolios are not
diversified, hence the lack of explanatory power of the factor ICC. Moreover, if the number
of assets in each industry is large enough, we should expect that the contribution of the
residual risk to the total risk goes to zero, as it goes to zero for the market portfolio.

[Insert Tables 6 and 7 about here]

The situation is totally different when one considers the same set of industry
portfolios but constructed on an equally-weighted basis. In this case, each industry portfolio is
"well-diversified," in the sense that the weight of each asset in a given industry portfolio is
inversely proportional to the number of assets in this portfolio. Tables 6 and 7 summarize
the values of the parameter estimates and of the $R^2$, respectively, of the multi-linear time
series regressions of the excess monthly returns on 10 equally-weighted industry portfolios
regressed, as previously, on the excess return on the market portfolio, on the two Fama-French
factors and on the factor ICC, on the one hand, and on the same set of factors as in Tables 3
and 5, on the other hand. As in the case of the 25 equally-weighted portfolios sorted by size
and book-to-market, the addition of the internal consistency factor ICC to the market factor
provides overall the best improvement in terms of the $R^2$ of two-factor models. In addition,
no three- or four-factor model provides a statistically significant improvement while the GRS
test does not reject the hypothesis of a zero intercept for the model "Market factor + ICC
factor" at the 2% level.

This confirms that the two-factor model constructed with the market portfolio and with 
the internal consistency factor ICC has overall the same explanatory power as the three-factor 
Fama-French model. 

3.4 Relation between the internal consistency factor ICC and the 
two Fama and French factors SMB and HML 

As illustrated above, the additional internal consistency factor allows us to explain several
well-known pricing anomalies, with a power comparable to the HML + SMB Fama-French
factors. We now discuss why this can be expected on the basis of our theoretical results.
Specifically, starting from our theoretical framework, we address the question of why
the two additional factors of Fama and French should have explanatory power, that is, what
could be the origins of the size and book-to-market effects.

The size effect. The size effect is well-known to generally explain the part of the
cross-section of expected returns left unexplained by any misspecified asset pricing model (Berk
1995), which raises the question of its relevance as the signature of a genuine risk factor.
Our theoretical model provides an answer to this question by rationalizing the role of the
size effect as providing a proxy for the diversification factor $f$ (or ICC). Indeed, since the
arbitrage portfolio which proxies the ICC factor is long in the equally-weighted portfolio and
short in the market portfolio, it is long on the small caps and short on the large
caps, just like the SMB portfolio. There is thus no qualitative difference between Fama
and French's factor SMB and our proxy of the ICC factor. This is confirmed by the large
value of the linear correlation between the two portfolios proxying the SMB and ICC factors,
equal to 86% over the time interval studied here. As an illustration, the return on each
factor is depicted on the left panel of Figure 3, while the right panel represents the value of
$1 invested in the market portfolio in Jan. 1927 and the value of a leveraged position of $1
invested in SMB and ICC in Jan. 1927.
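The two quantities discussed here, the linear correlation between the factor returns and the value of an initial dollar, can be computed as in the sketch below. The return series are made-up numbers, and the compounding convention for a long-short factor (the factor return earned each month on the current value of the position) is an assumption, since the text does not detail how the leveraged position is defined.

```python
from math import sqrt


def pearson_correlation(x, y):
    """Linear (Pearson) correlation between two equally long return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cumulative_value(returns, initial=1.0):
    """Value through time of `initial` dollars compounded at the given periodic returns."""
    values, wealth = [], initial
    for r in returns:
        wealth *= 1.0 + r
        values.append(wealth)
    return values


# Made-up monthly returns, for illustration only.
r_smb = [0.012, -0.008, 0.020, -0.015, 0.005, 0.010]
r_icc = [0.015, -0.005, 0.018, -0.020, 0.002, 0.012]

print(f"correlation(SMB, ICC) = {pearson_correlation(r_smb, r_icc):.2f}")
print(f"value of $1 after 6 months in ICC: {cumulative_value(r_icc)[-1]:.4f}")
```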

[Insert Figure 3 about here]



The book-to-market effect. As illustrated by Stattman (1980) and Rosenberg et al.
(1985) in the early eighties and as emphasized more recently by Fama and French (1992,
1993), stocks with a high book-to-market value tend to outperform stocks with a low
book-to-market value. Several economic explanations have been proposed to justify this
phenomenon. Among others, Fama and French have proposed that value stocks are companies
that are in financial distress while Campbell and Vuolteenaho (2004) have suggested that
growth stocks might have speculative investment opportunities that will be profitable only
if equity financing is available on sufficiently good terms.

The pricing formula provided by Proposition 3 offers a straightforward justification of the
book-to-market effect. Indeed, there is good empirical evidence that high book-to-market
stocks have significantly lower betas with respect to the market portfolio compared with
low book-to-market stocks. For instance, using a large sample of firms from 1977 to 2004,
Bernardo et al. (2007) find that the difference between the betas of growth opportunities
and the betas of assets-in-place is positive and statistically significant, at the 95% level, in 34
out of 37 industry classifications. Bernardo et al. suggest that this results from the fact that,
since firms with more growth opportunities have cash flows with longer duration, their values
are more sensitive to changes in interest rates and thus should have higher betas. Then,
ceteris paribus, the additional term $(\gamma_i - \gamma_m \cdot \beta_i) \cdot \mathrm{E}[r_{icc} - r_0]$ introduced by the internal
consistency constraint leads to a higher expected rate of return for a stock with a low beta
if the term $\gamma_m$ is positive.
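The mechanism can be made concrete with a small numerical example based on the pricing formula (47). All inputs are hypothetical: two stocks share the same loading $\gamma_i$ on the endogenous factor and differ only in their market beta, with a positive $\gamma_m$ and a positive premium on the ICC portfolio.

```python
def icc_premium_term(beta_i, gamma_i, gamma_m, icc_premium):
    """Second term of (47): (gamma_i - gamma_m * beta_i) * E[r_icc - r_0]."""
    return (gamma_i - gamma_m * beta_i) * icc_premium


def expected_excess_return(beta_i, gamma_i, gamma_m, market_premium, icc_premium):
    """Full right-hand side of (47)."""
    return beta_i * market_premium + icc_premium_term(beta_i, gamma_i, gamma_m, icc_premium)


# Hypothetical inputs: identical gamma_i, different betas, positive gamma_m.
gamma_m, market_premium, icc_premium = 0.3, 0.06, 0.03
value_stock = icc_premium_term(beta_i=0.8, gamma_i=0.5, gamma_m=gamma_m, icc_premium=icc_premium)
growth_stock = icc_premium_term(beta_i=1.2, gamma_i=0.5, gamma_m=gamma_m, icc_premium=icc_premium)
print(f"ICC term, low-beta (value) stock:   {value_stock:.4f}")
print(f"ICC term, high-beta (growth) stock: {growth_stock:.4f}")
```

The comparison concerns the additional term only. The total expected return also contains the usual market term $\beta_i \cdot \mathrm{E}[r_m - r_0]$, which increases with beta, so the net effect depends on the relative size of the two premia.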

4 Conclusion 

Starting from a factor model in which the only a priori systematic risk is the market
portfolio, we have shown that there is a new source of significant systematic risk, which has been
totally neglected up to now but which ought to be priced by the market. This occurs when
(i) the internal consistency condition holds (which simply means that the market portfolio
is constituted of the assets whose returns it is supposed to explain) and (ii) the distribution
of the capitalization of firms is sufficiently fat-tailed, as is the case in real economies. The
corresponding new internal consistency factors do not disappear for arbitrarily large economies
because the contribution of the largest firms to the risk of arbitrary well-diversified portfolios
remains finite for arbitrarily large economies when the distribution of the capitalization
of firms is sufficiently heavy-tailed. For this reason, this endogenous factor can be
considered as related to the existence of a diversification/concentration premium resulting from
the concern of investors with respect to the level of diversification of their portfolio, insofar
as holding the market portfolio alone does not allow for a good diversification.

Applied to the Arbitrage Pricing Theory, we have shown that the original derivation of
Ross' results still holds, provided that we explicitly include the additional diversification
factor in the analysis. As a consequence, this factor is shown to provide possible theoretical
economic explanations of some of the empirical factors reported in the literature. In
particular, it helps explain the superior performance of the Fama and French three-factor
model in explaining the cross section of stock returns. Indeed, the diversification factor
provides a rationalization of the SMB factor as a proxy of this factor. Besides, being consistent
with the fact that high book-to-market stocks have significantly lower betas with respect to
the market portfolio compared with low book-to-market stocks, the Value/Growth effect is
related to the increasing sensitivity of value stocks to the diversification factor. Finally, on
the basis of only two factors (the market portfolio and the equally-weighted portfolio), our
model turns out to be at least as successful as the Fama and French three-factor model in
explaining the cross-section of monthly returns on US stocks over the time period from Jan.
1927 to Dec. 2005.
Table 1
Numerical simulations

Average, minimum and maximum value of the $R^2$ of the regression of the return of 20 equally-weighted portfolios (randomly drawn from a market
of $N = 1000$ and $N = 10000$ assets according to the model (19)) on the market portfolio ($r_m$), on the market portfolio and the internal consistency
factor ($r_m$, $f$), on the market portfolio and the (overall) equally-weighted portfolio ($r_m$, $r_e$), on the market portfolio and an under-diversified
portfolio ($r_m$, $r_u$) and on the market portfolio and a well-diversified arbitrage portfolio ($r_m$, $r_a$). Different market situations are considered, with
distributions of firm sizes whose tail index $\mu$ varies from 0.5 to 2.
Table 2

Multi-factor time series regressions for monthly excess returns on 25
equally-weighted portfolios sorted by size and book-to-market (beta):
Jan. 1927 - Dec. 2005, 948 months

Parameter estimates of the linear regression of the excess returns on 25 equally-weighted portfolios
(sorted by quintiles of the distribution of size - Small, 2, 3, 4 and Big - and by quintiles of the
distribution of Book equity to Market equity ratio - Low, 2, 3, 4 and High) regressed on the
excess return on the market portfolio, on the two Fama-French factors SMB and HML and on
the proxy for the additional risk factor due to the internal consistency constraint given by the
difference between the return on the equally-weighted portfolio and the return on the market
portfolio:

$r_{i,t} - r_0 = \alpha_i + \beta_i \cdot [r_m(t) - r_0] + \beta_i^{ICC} \cdot r_{icc}(t) + \beta_i^{SMB} \cdot r_{smb}(t) + \beta_i^{HML} \cdot r_{hml}(t) + \varepsilon_i(t)$.

In the four columns labeled $\beta$, $\beta^{SMB}$, $\beta^{HML}$ and $\beta^{ICC}$, the figures marked with one star (resp.
two stars) indicate the cases which reject the null hypothesis that the factor under consideration
is not significant in the presence of the others at the 5% (resp. the 1%) level. For instance, for
the portfolio Big-High, the factor SMB is not significant (neither at the 5% nor the 1% level)
in the presence of the market factor, the factor HML and the proxy for the factor ICC.
Similarly, the factor ICC is not significant in the presence of the market factor, the SMB and
HML factors while, in contrast, the factor HML is still significant at the 1% level in the presence
of the market factor, the SMB and ICC factors.
Table 3

Multi-factor time series regressions for monthly excess returns on 25 equally-weighted portfolios sorted by size and
book-to-market ($R^2$): Jan. 1927 - Dec. 2005, 948 months

$R^2$ of the linear regression of the excess returns of 25 equally-weighted portfolios (sorted by quintiles of the distribution of size - Small, 2, 3, 4
and Big - and by quintiles of the distribution of Book equity to Market equity ratio - Low, 2, 3, 4 and High) on the market portfolio (Rm),
on the market portfolio and the factor ICC (ICC), on the market portfolio and the size factor (SMB), on the market portfolio and the book-to-
market factor (HML), on the market portfolio and the two Fama-French factors (HML + SMB), on the market portfolio, the factor ICC and
the size factor (ICC + SMB), on the market portfolio, the factor ICC and the book-to-market factor (ICC + HML) and, finally, on all four
factors (Market, ICC, SMB and HML). Figures in boldface represent the maximum value of the $R^2$ within the group of regressions with two factors
(columns ICC, SMB and HML) and with three factors (columns HML + SMB, ICC + SMB and ICC + HML). The last two rows report the Gibbons
et al. (1989) test statistics and p-values.
Size  B/M      Rm              ICC             SMB             HML             HML + SMB       ICC + SMB       ICC + HML       All four

4     Low      ...             ...             ...             ...             ...             ...             ...             92.6%
                                                                                                                               (91.4%, 94.0%)
      2        89.4%           91.3%           90.8%           90.0%           91.4%           91.3%           91.4%           91.5%
               (87.1%, 91.5%)  (89.0%, 93.3%)  (88.2%, 93.1%)  (88.0%, 91.9%)  (89.3%, 93.5%)  (89.1%, 93.4%)  (89.3%, 93.4%)  (89.4%, 93.5%)
      3        [illegible]     [illegible]     [illegible]     [illegible]     [illegible]     [illegible]     [illegible]     [illegible]
               (84.5%, 89.8%)  (87.5%, 92.5%)  (86.2%, 91.5%)  (88.5%, 92.6%)  (89.8%, 94.0%)  (87.8%, 92.7%)  (89.7%, 93.8%)  (89.8%, 94.0%)
      4        [illegible]     86.6%           [illegible]     91.8%           [illegible]     88.1%           [illegible]     92.8%
               (78.6%, 85.7%)  (82.6%, 89.8%)  (79.9%, 87.1%)  (89.2%, 93.9%)  (90.3%, 94.6%)  (84.1%, 91.1%)  (90.2%, 94.6%)  (90.3%, 94.7%)
      High     74.4%           82.1%           76.6%           [illegible]     [illegible]     [illegible]     [illegible]     [illegible]
               (69.6%, 78.9%)  (77.6%, 85.9%)  (72.0%, 81.1%)  (87.6%, 93.2%)  (89.9%, 94.5%)  (79.7%, 88.7%)  (90.0%, 94.6%)  (90.1%, 94.6%)

Big   Low      92.0%           92.5%           92.2%           95.1%           95.2%           92.7%           95.1%           95.5%
               (90.5%, 93.3%)  (91.0%, 93.8%)  (90.7%, 93.5%)  (94.0%, 96.1%)  (94.2%, 96.2%)  (91.1%, 94.1%)  (94.0%, 96.1%)  (94.6%, 96.4%)
      2        93.3%           93.3%           93.5%           93.7%           93.9%           93.9%           93.7%           94.0%
               (91.0%, 94.9%)  (91.0%, 95.0%)  (91.5%, 95.0%)  (91.6%, 95.3%)  (92.0%, 95.4%)  (92.1%, 95.4%)  (91.8%, 95.3%)  (92.2%, 95.5%)
      3        88.2%           88.3%           88.4%           92.3%           92.7%           90.6%           92.5%           92.7%
               (85.0%, 90.6%)  (85.1%, 90.9%)  (85.6%, 90.9%)  (90.0%, 94.2%)  (90.5%, 94.5%)  (87.9%, 93.0%)  (90.3%, 94.3%)  (90.5%, 94.6%)
      4        79.0%           80.5%           79.1%           91.9%           92.0%           86.0%           91.9%           92.2%
               (74.3%, 82.9%)  (75.8%, 84.5%)  (74.7%, 83.0%)  (89.2%, 94.0%)  (89.3%, 94.1%)  (81.5%, 89.6%)  (89.2%, 94.0%)  (89.6%, 94.3%)
      High     70.1%           72.6%           70.1%           86.2%           86.2%           78.5%           86.3%           86.5%
               (64.1%, 75.2%)  (66.9%, 77.2%)  (64.4%, 75.3%)  (82.4%, 89.9%)  (82.5%, 89.9%)  (72.7%, 83.3%)  (82.5%, 89.9%)  (82.8%, 90.1%)

Average        76.1%           87.3%           84.7%           82.2%           90.6%           88.6%           90.8%           91.4%
               (72.6%, 79.7%)  (85.3%, 89.3%)  (82.4%, 87.2%)  (79.4%, 85.3%)  (89.1%, 92.2%)  (86.7%, 90.6%)  (89.5%, 92.2%)  (90.2%, 92.8%)

GRS            4.37            4.11            4.41            4.02            4.07            4.19            3.92            4.06
p-value        0.00            0.00            0.00            0.00            0.00            0.00            0.00            0.00


Table 4 

Multi-factor time series regressions for monthly excess returns on 10 
value-weighted industry portfolios (beta): Jan. 1927 - Dec. 2005, 948 months 

Parameter estimates of the linear regression of the excess returns of ten value-weighted industry 
portfolios on the excess return on the market portfolio, on the two Fama-French factors 
SMB and HML, and on the proxy for the additional risk factor due to the internal consistency 
constraint, given by the difference between the return on the equally-weighted portfolio and the 
return on the market portfolio: 

$$r_{i,t} - r_0 = \alpha_i + \beta_i \cdot [r_m(t) - r_0] + \beta_i^{ICC} \cdot r_{icc}(t) + \beta_i^{SMB} \cdot r_{smb}(t) + \beta_i^{HML} \cdot r_{hml}(t) + \varepsilon_i(t).$$

In the four columns labeled $\beta$, $\beta^{SMB}$, $\beta^{HML}$ and $\beta^{ICC}$, the figures marked with one star (resp. 
two stars) indicate the cases that reject, at the 5% (resp. the 1%) level, the null hypothesis that the 
factor under consideration is not significant in the presence of the others. For instance, for the 
Shops, the Health and the Utilities industries, the factor SMB is not significant (at either the 
5% or the 1% level) in the presence of the market factor, the factor HML and the proxy 
for the factor ICC. Similarly, for the same industry portfolios, the factor ICC is not significant 
in the presence of the market factor and the SMB and HML factors. 



Industry                 α         β         β^SMB     β^HML     β^ICC     R²
Consumer Non Durables    0.0019    0.78**    0.08      0.07*    -0.15**    78%
Consumer Durables       -0.0006    1.11**   -0.12      0.11*     0.24**    75%
Manufacturing           -0.0006    1.10**    0.11**    0.19**   -0.11*     92%
Energy                   0.0018    0.86**   -0.10      0.30**   -0.16      64%
Business Equipment       0.0012    1.27**   -0.21**   -0.45**    0.37**    84%
Telecom                  0.0015    0.69**   -0.27**   -0.15**    0.18**    63%
Shops                    0.0010    0.96**    0.03     -0.14**    0.06      80%
Health                   0.0030    0.91**   -0.01     -0.15**   -0.11      68%
Utilities               -0.0001    0.79**   -0.05      0.35**   -0.11      63%
Others                  -0.0016    1.06**   -0.07      0.30**    0.12**    92%
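
As an illustration of the kind of time-series regression summarized in Table 4, the following minimal sketch builds the ICC proxy as the difference between the equally-weighted and the market returns and estimates the four-factor regression by ordinary least squares. It is only a sketch under stated assumptions: the data are synthetic (they are not the industry portfolio returns used here), the "true" coefficients are arbitrary, and the helper names `solve` and `ols` are hypothetical.

```python
import random

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix (copy)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(y, factors):
    """OLS of y on a constant plus the factor series; returns (coefficients, R^2)."""
    rows = [[1.0] + list(obs) for obs in zip(*factors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot

# Synthetic monthly data (an assumption, not the paper's data): 948 months.
rng = random.Random(0)
T = 948
mkt = [rng.gauss(0.006, 0.05) for _ in range(T)]   # market excess return
ew = [m + rng.gauss(0.002, 0.03) for m in mkt]     # equally-weighted excess return
icc = [e - m for e, m in zip(ew, mkt)]             # ICC proxy: equally-weighted minus market
smb = [rng.gauss(0.002, 0.03) for _ in range(T)]
hml = [rng.gauss(0.004, 0.03) for _ in range(T)]

# Arbitrary illustrative coefficients: alpha, beta, beta_ICC, beta_SMB, beta_HML.
true_coef = [0.001, 1.10, 0.30, -0.20, 0.40]
y = [true_coef[0] + true_coef[1] * m + true_coef[2] * i + true_coef[3] * s
     + true_coef[4] * h + rng.gauss(0.0, 0.01)
     for m, i, s, h in zip(mkt, icc, smb, hml)]

coef, r2 = ols(y, [mkt, icc, smb, hml])   # four-factor regression
_, r2_mkt_only = ols(y, [mkt])            # market factor alone
print([round(c, 3) for c in coef], round(r2, 3), round(r2_mkt_only, 3))
```

Calling `ols` with different subsets of the factor series (for example `[mkt, icc]` or `[mkt, smb, hml]`) gives the nested comparisons of explanatory power that the accompanying R² tables report.
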

Table 5 

Multi-factor time series regressions for monthly excess returns on 10 value-weighted industry portfolios ($R^2$): Jan. 1927 - Dec. 2005, 948 months 

$R^2$ of the linear regressions of the excess returns of ten value-weighted industry portfolios on the excess return on the market portfolio 
(Rm), on the market portfolio and the factor ICC (ICC), on the market portfolio and the size factor (SMB), on the market portfolio and the book 
to market factor (HML), on the market portfolio and the two Fama-French factors (HML + SMB), on the market portfolio, the factor ICC and 
the size factor (ICC + SMB), on the market portfolio, the factor ICC and the book to market factor (ICC + HML) and, finally, on the four factors 
(market, ICC, SMB, HML). Figures in boldface represent the maximum value of the $R^2$ within the group of regressions with two factors (columns 
ICC, SMB and HML) and with three factors (columns HML + SMB, ICC + SMB and ICC + HML). The last two rows report the Gibbons et al. 
(1989) test statistics and p-values. 





                        Rm              ICC             SMB             HML             HML + SMB       ICC + SMB       ICC + HML       All four factors
Consumer Non Durables   77.4%           77.6%           77.5%           77.5%           77.5%           77.6%           77.7%           77.7%
                        (72.6%, 81.4%)  (72.8%, 81.7%)  (72.7%, 81.5%)  (72.8%, 81.6%)  (73.0%, 81.7%)  (72.9%, 81.7%)  (73.0%, 81.7%)  (73.1%, 81.9%)
Consumer Durables       74.0%           74.6%           74.2%           74.8%           74.9%           75.0%           75.1%           75.1%
                        (69.0%, 78.4%)  (69.8%, 78.9%)  (69.2%, 78.5%)  (69.9%, 79.2%)  (70.1%, 79.3%)  (70.2%, 79.4%)  (70.3%, 79.4%)  (70.4%, 79.5%)
Manufacturing           91.6%           91.7%           91.6%           92.3%           92.3%           91.7%           92.3%           92.4%
Energy                  60.1%           60.4%           61.4%           62.2%           63.6%           61.9%           63.7%           63.7%
                        (53.9%, 65.3%)  (54.2%, 65.8%)  (55.5%, 66.7%)  (56.0%, 67.2%)  (58.2%, 68.6%)  (56.0%, 67.2%)  (58.2%, 68.6%)  (58.3%, 68.7%)
Business Equipment      81.3%           81.3%           81.4%           83.6%           83.8%           81.6%           84.0%           84.2%
                        (77.3%, 84.6%)  (77.5%, 84.6%)  (77.9%, 84.7%)  (80.7%, 86.2%)  (80.9%, 86.4%)  (78.1%, 84.9%)  (81.3%, 86.6%)  (81.6%, 86.8%)
Telecom                 61.4%           62.0%           62.2%           62.0%           62.7%           62.2%           62.3%           63.0%
                        (56.0%, 66.5%)  (57.2%, 67.0%)  (57.3%, 67.2%)  (56.9%, 67.0%)  (58.1%, 67.6%)  (57.5%, 67.3%)  (57.6%, 67.2%)  (58.5%, 67.9%)
Shops                   78.9%           78.9%           79.1%           79.4%           79.5%           79.2%           79.6%           79.6%
                        (74.5%, 82.7%)  (74.6%, 82.7%)  (74.8%, 83.1%)  (75.3%, 83.1%)  (75.5%, 83.4%)  (75.0%, 83.2%)  (75.5%, 83.3%)  (75.6%, 83.4%)
Health                  66.0%           67.0%           66.4%           67.3%           67.7%           67.2%           67.7%           67.7%
                        (59.1%, 71.3%)  (60.8%, 72.4%)  (59.8%, 71.7%)  (61.0%, 72.5%)  (61.6%, 72.9%)  (61.5%, 72.7%)  (61.8%, 73.1%)  (62.2%, 73.2%)
Utilities               58.5%           58.5%           59.1%           62.2%           62.9%           60.1%           62.9%           62.9%
                        (51.0%, 65.3%)  (51.0%, 65.5%)  (51.8%, 65.6%)  (55.3%, 68.7%)  (56.1%, 69.1%)  (52.9%, 66.8%)  (56.4%, 69.2%)  (56.5%, 69.2%)
Others                  88.4%           89.2%           88.4%           91.7%           91.7%           90.2%           91.8%           91.8%
                        (86.5%, 90.2%)  (87.2%, 91.1%)  (86.5%, 90.4%)  (89.9%, 93.6%)  (89.9%, 93.6%)  (88.4%, 92.2%)  (90.0%, 93.6%)  (90.0%, 93.7%)
Average                 73.7%           74.1%           74.1%           75.2%           75.6%           74.6%           75.6%           75.7%
                        (70.2%, 76.8%)  (70.8%, 77.2%)  (70.8%, 77.2%)  (72.1%, 78.2%)  (72.7%, 78.6%)  (71.4%, 77.8%)  (72.6%, 78.6%)  (72.9%, 78.7%)
GRS                     2.67            3.23            3.16            3.32            3.69            3.09            3.63            3.65
p-value                 0.00            0.00            0.00            0.00            0.00            0.00            0.00            0.00
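
The Gibbons et al. (1989) statistic reported in the last rows can be sketched as follows. The sketch assumes the usual multi-factor form J = (T - N - K)/N · α'Σ⁻¹α / (1 + μ'Ω⁻¹μ), where α collects the N estimated intercepts, Σ is the maximum-likelihood covariance of the residuals, and μ and Ω are the sample mean and maximum-likelihood covariance of the K factors; under the null of zero intercepts and normal residuals, J follows an F distribution with (N, T - N - K) degrees of freedom. Whether exactly this normalization was used for the tables is an assumption, the data below are synthetic, and the names `solve` and `grs_statistic` are hypothetical. The p-value is not computed because the Python standard library has no F distribution.

```python
import random

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix (copy)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def grs_statistic(returns, factors):
    """returns: N excess-return series; factors: K factor series; all of length T."""
    T, N, K = len(factors[0]), len(returns), len(factors)
    rows = [[1.0] + list(obs) for obs in zip(*factors)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(K + 1)] for i in range(K + 1)]
    alphas, resid = [], []
    for y in returns:  # one time-series OLS regression per asset
        xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(K + 1)]
        coef = solve(xtx, xty)
        alphas.append(coef[0])
        resid.append([yi - sum(c * v for c, v in zip(coef, r)) for r, yi in zip(rows, y)])
    # Maximum-likelihood (divide by T) residual covariance, N x N.
    sigma = [[sum(ei * ej for ei, ej in zip(resid[i], resid[j])) / T
              for j in range(N)] for i in range(N)]
    # Factor means and maximum-likelihood factor covariance, K x K.
    mu = [sum(f) / T for f in factors]
    omega = [[sum((a - mu[i]) * (b - mu[j]) for a, b in zip(factors[i], factors[j])) / T
              for j in range(K)] for i in range(K)]
    quad_alpha = sum(a * x for a, x in zip(alphas, solve(sigma, alphas)))
    quad_mu = sum(m * x for m, x in zip(mu, solve(omega, mu)))
    return (T - N - K) / N * quad_alpha / (1.0 + quad_mu)

# Synthetic example (an assumption): 5 assets, 2 factors, 240 months.
rng = random.Random(1)
n_months, n_assets = 240, 5
factors = [[rng.gauss(0.005, 0.04) for _ in range(n_months)] for _ in range(2)]
betas = [[rng.uniform(0.5, 1.5), rng.uniform(-0.5, 0.5)] for _ in range(n_assets)]
null_returns = [[b[0] * f1 + b[1] * f2 + rng.gauss(0.0, 0.01)
                 for f1, f2 in zip(*factors)] for b in betas]      # true intercepts are zero
alt_returns = [[r + 0.01 for r in series] for series in null_returns]  # 1% monthly mispricing
grs_null = grs_statistic(null_returns, factors)
grs_alt = grs_statistic(alt_returns, factors)
print(round(grs_null, 2), round(grs_alt, 2))
```

A large value of the statistic, as in the second case, rejects the joint hypothesis that all intercepts are zero.
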



Table 6 

Multi-factor time series regressions for monthly excess returns on 10 
equally-weighted industry portfolios (beta): Jan. 1927 - Dec. 2005, 948 months 

Parameter estimates of the linear regression of the excess returns of ten equally-weighted industry 
portfolios on the excess return on the market portfolio, on the two Fama-French factors 
SMB and HML, and on the proxy for the additional risk factor due to the internal consistency 
constraint, given by the difference between the return on the equally-weighted portfolio and the 
return on the market portfolio: 

$$r_{i,t} - r_0 = \alpha_i + \beta_i \cdot [r_m(t) - r_0] + \beta_i^{ICC} \cdot r_{icc}(t) + \beta_i^{SMB} \cdot r_{smb}(t) + \beta_i^{HML} \cdot r_{hml}(t) + \varepsilon_i(t).$$

In the four columns labeled $\beta$, $\beta^{SMB}$, $\beta^{HML}$ and $\beta^{ICC}$, the figures marked with one star (resp. 
two stars) indicate the cases that reject, at the 5% (resp. the 1%) level, the null hypothesis that the 
factor under consideration is not significant in the presence of the others. 



Industry                 α         β         β^SMB     β^HML     β^ICC     R²
Consumer Non Durables   -0.0003    0.84**    0.08*     0.10**    0.77**    94%
Consumer Durables       -0.0024    1.12**    0.21**    0.07*     0.97**    92%
Manufacturing           -0.0004    1.07**    0.12**    0.17**    0.76**    97%
Energy                   0.0019    0.95**    0.13      0.34**    0.55**    69%
Business Equipment       0.0016    1.22**   -0.29**   -0.65**    1.52**    92%
Telecom                  0.0030    0.92**   -0.30**   -0.54**    0.98**    73%
Shops                    0.0000    0.91**    0.11*    -0.11**    0.93**    90%
Health                   0.0037    0.91**   -0.04     -0.54**    0.92**    80%
Utilities                0.0006    0.85**    0.21*     0.55**   -0.06      66%
Others                  -0.0008    0.95**    0.07      0.39**    0.93**    95%

Table 7 

Multi-factor time series regressions for monthly excess returns on 10 equally-weighted industry portfolios ($R^2$): Jan. 1927 - Dec. 2005, 948 months 

$R^2$ of the linear regressions of the excess returns of ten equally-weighted industry portfolios on the excess return on the market portfolio 
(Rm), on the market portfolio and the factor ICC (ICC), on the market portfolio and the size factor (SMB), on the market portfolio and the 
book to market factor (HML), on the market portfolio and the two Fama-French factors (HML + SMB), on the market portfolio, the factor ICC 
and the size factor (ICC + SMB), on the market portfolio, the factor ICC and the book to market factor (ICC + HML) and, finally, on the four 
factors (market, ICC, SMB and HML). Figures in boldface represent the maximum value of the $R^2$ within the group of regressions with two factors 
(columns ICC, SMB and HML) and with three factors (columns HML + SMB, ICC + SMB and ICC + HML). The last two rows report the Gibbons 
et al. (1989) test statistics and p-values. 





                        Rm              ICC             SMB             HML             HML + SMB       ICC + SMB       ICC + HML       All four factors
Consumer Non Durables   75.9%           94.1%           88.4%           79.7%           91.8%           94.1%           94.?%           94.3%
                        (70.9%, 80.5%)  (92.4%, 95.5%)  (85.2%, 91.2%)  (74.9%, 83.9%)  (89.5%, 93.9%)  (92.5%, 95.5%)  (92.7%, 95.6%)  (92.7%, 95.7%)
Consumer Durables       74.4%           92.3%           87.9%           76.9%           90.2%           92.4%           92.3%           92.4%
                        (69.2%, 79.2%)  (90.2%, 94.2%)  (84.8%, 91.1%)  (72.2%, 81.9%)  (87.6%, 92.6%)  (90.3%, 94.3%)  (90.2%, 94.2%)  (90.4%, 94.3%)
Manufacturing           82.2%           96.7%           92.0%           85.9%           95.4%           96.8%           97.0%           97.1%
                        (78.3%, 86.0%)  (95.7%, 97.6%)  (89.9%, 93.9%)  (82.5%, 88.9%)  (93.9%, 96.6%)  (95.7%, 97.6%)  (96.1%, 97.8%)  (96.2%, 97.9%)
Energy                  58.3%           67.8%           63.7%           63.4%           68.5%           68.1%           69.3%           69.3%
                        (51.7%, 64.5%)  (61.9%, 73.7%)  (57.6%, 69.9%)  (58.6%, 68.5%)  (63.0%, 74.1%)  (62.3%, 74.0%)  (64.0%, 74.7%)  (64.0%, 74.8%)
Business Equipment      74.5%           87.4%           86.2%           74.8%           86.6%           88.0%           91.6%           91.8%
                        (68.7%, 79.9%)  (85.0%, 89.8%)  (82.5%, 89.4%)  (69.3%, 80.1%)  (83.2%, 89.6%)  (85.9%, 90.4%)  (90.1%, 93.0%)  (90.4%, 93.2%)
Telecom                 62.7%           68.2%           68.1%           63.9%           69.4%           68.6%           72.6%           73.0%
                        (55.4%, 69.2%)  (64.0%, 72.8%)  (61.4%, 74.0%)  (56.5%, 70.5%)  (63.4%, 75.1%)  (64.2%, 74.2%)  (69.0%, 77.0%)  (69.5%, 77.3%)
Shops                   71.8%           90.1%           86.7%           72.8%           87.6%           90.3%           90.4%           90.5%
                        (66.7%, 77.0%)  (86.2%, 93.1%)  (82.8%, 90.5%)  (67.7%, 78.2%)  (83.6%, 91.2%)  (86.6%, 93.4%)  (87.1%, 93.3%)  (87.1%, 93.4%)
Health                  65.1%           74.5%           75.9%           66.4%           77.4%           76.2%           80.5%           80.5%
                        (58.2%, 71.3%)  (69.5%, 79.0%)  (72.0%, 79.7%)  (60.0%, 72.5%)  (73.7%, 81.0%)  (72.8%, 79.9%)  (77.3%, 83.7%)  (77.4%, 83.8%)
Utilities               58.3%           60.8%           58.9%           65.9%           66.5%           61.7%           66.3%           66.5%
                        (51.0%, 65.3%)  (52.8%, 68.8%)  (51.6%, 66.8%)  (58.2%, 72.8%)  (58.3%, 74.0%)  (53.8%, 69.5%)  (58.3%, 73.6%)  (58.5%, 74.1%)
Others                  71.9%           92.8%           83.6%           81.6%           92.7%           93.4%           95.2%           95.2%
                        (66.1%, 77.2%)  (90.5%, 94.8%)  (79.3%, 87.5%)  (77.5%, 85.3%)  (90.1%, 94.9%)  (91.1%, 95.3%)  (93.4%, 96.6%)  (93.5%, 96.6%)
Average                 69.5%           82.4%           79.1%           73.1%           82.6%           82.9%           84.9%           85.0%
                        (65.0%, 73.9%)  (79.9%, 85.1%)  (76.0%, 82.3%)  (69.0%, 77.1%)  (79.9%, 85.4%)  (80.5%, 85.6%)  (82.9%, 87.2%)  (83.0%, 87.3%)
GRS                     2.53            2.21            2.69            2.72            2.61            2.24            2.70            2.70
p-value                 0.01            0.02            0.00            0.00            0.00            0.01            0.00            0.00

Figure 1. Concentration of the market portfolio. The upper panel shows the 
weight of the largest firms in the market portfolio as a function of the tail index $\mu$ 
of the Pareto distribution of firm sizes. The lower panel shows the Herfindahl index 
of the market portfolio as a function of the tail index $\mu$ of the Pareto distribution of 
firm sizes. In both cases, the continuous line gives the values in the limit of an 
infinite economy, while the dotted and dash-dotted curves refer to economies 
with one thousand and ten thousand firms, respectively. 
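
The dependence of concentration on the tail index described in this caption can be explored numerically with a minimal sketch. It assumes that firm sizes are independent draws from a Pareto distribution with tail index mu, that the Herfindahl index is the sum of squared market-portfolio weights, and that "largest firms" is reduced to the single largest firm for simplicity; the function names and the choice of 1000 firms averaged over 20 simulated economies are illustrative only.

```python
import random

def concentration(sizes):
    """Return (weight of the largest firm, Herfindahl index) of the value-weighted portfolio."""
    total = sum(sizes)
    weights = [s / total for s in sizes]
    return max(weights), sum(w * w for w in weights)

def average_concentration(mu, n_firms, n_draws, rng):
    """Average both measures over several simulated economies with Pareto(mu) firm sizes."""
    top, herf = 0.0, 0.0
    for _ in range(n_draws):
        sizes = [rng.paretovariate(mu) for _ in range(n_firms)]
        w_max, h = concentration(sizes)
        top += w_max / n_draws
        herf += h / n_draws
    return top, herf

rng = random.Random(0)
results = {mu: average_concentration(mu, 1000, 20, rng) for mu in (0.5, 1.0, 1.5, 3.0)}
for mu, (w_max, h) in results.items():
    print(f"mu={mu}: largest weight={w_max:.3f}, Herfindahl={h:.4f}")
```

For a perfectly diversified market of N equal firms the Herfindahl index equals 1/N, so values far above 1/N at small mu indicate that a few firms dominate the market portfolio even when N is large.
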

Figure 2. Contribution of the residual variance to the total variance. 
The figure shows the probability p that the contribution of the residual variance 
to the total variance of the return on the equally-weighted portfolio reaches or 
exceeds a given level (in percent), in a market with 7000-8000 traded assets and 
a distribution of firm sizes given by Zipf's law ($\mu = 1$). 

Figure 3. Comparison of ICC and SMB. The upper panel shows the return 
of the factor SMB versus the return of the factor ICC. The straight line is the 
regression line, with equation y = -0.0008 + 0.8292 x. The lower panel depicts 
the value of $1 invested in the market portfolio in Jan. 1927 (grey curve; green 
online) and the value of a leveraged position of $1 invested in SMB (dark grey curve; 
blue online) and in ICC (black curve; red online) in Jan. 1927. For the two arbitrage 
portfolios SMB and ICC, the initial endowment of $1 can be thought of as a reserve 
to insure against losses, from which the returns can be discounted to provide 
the wealth curves shown. 